CCV offers a high-performance storage system for research data called
RData, which is accessible as the
/gpfs/data file system
on all CCV systems.
You can transfer files between Department File Servers (Isilon) and Oscar with smbclient; see Copying files from Department File Servers for instructions.
Note: RData is not designed to store confidential data (information about an
individual or entity). If you have confidential data that needs to be stored,
please contact us.
CCV uses IBM's General Parallel File System (GPFS) for users' home directories,
data storage, scratch/temporary space, and runtime libraries and executables. A
separate GPFS file system exists for each of these uses, in order to provide
tuned performance. These file systems are mounted as:
- Your home directory:
  - optimized for many small files (<1MB)
  - nightly backups (kept for 30 days)
- Your data directory:
  - optimized for reading large files (>1MB)
  - nightly backups (kept for 30 days)
  - quota is by group (usually >=256GB)
- Your scratch directory:
  - optimized for reading/writing large files (>1MB)
  - purging: files older than 30 days may be deleted
  - 512GB quota; contact us to increase it on a temporary basis
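Purging is based on modification time, so `find` can show which of your scratch files are at risk. The sketch below demonstrates the idea against a temporary directory, simulating a stale file with GNU `touch -d`; on Oscar you would point `find` at your scratch directory instead.

```shell
# Demo: list files not modified in the last 30 days (purge candidates).
tmp=$(mktemp -d)                       # stands in for ~/scratch
touch -d "40 days ago" "$tmp/old.dat"  # simulate a stale file
touch "$tmp/new.dat"                   # fresh file, would survive a purge
find "$tmp" -type f -mtime +30         # prints only the stale file
rm -rf "$tmp"
```

On Oscar the equivalent check would be `find ~/scratch -type f -mtime +30`.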
A good practice is to configure your application to read any initial input data
from ~/data and write all output into ~/scratch. Then, when the
application has finished, move or copy any data you would like to save from
~/scratch into ~/data.
Note: class or temporary accounts may not have a data directory.
To see how much space you have on Oscar, use the command
myquota. Below is an example of its output:
Block Limits | File Limits
Type Filesystem Used Quota HLIMIT Grace | Files Quota HLIMIT Grace
USR home 8.401G 10G 20G - | 61832 524288 1048576 -
USR scratch 332G 512G 12T - | 14523 323539 4194304 -
FILESET data+apollo 11.05T 20T 24T - | 459764 4194304 8388608 -
You can exceed your quota up to the hard limit for a grace period of 14 days. The grace period gives you time to manage your files; once it expires, you will be unable to write any files until you are back under quota.
To transfer files from your computer to Oscar, you can use:
- command line tools like scp
- GUI software such as WinSCP, Cyberduck, or FileZilla
- Globus Online
Use transfer nodes for file transfer to/from Oscar:
If you have access to a terminal, as on a Mac or Linux computer, you can conveniently use
scp to transfer files.
For example, to copy a file from your computer to Oscar:
scp /path/to/source/file <username>@transfer3.ccv.brown.edu:/path/to/destination/file
To copy a file from Oscar to your computer:
scp <username>@transfer3.ccv.brown.edu:/path/to/source/file /path/to/destination/file
On Windows, if you have PuTTY installed, you can use its
pscp utility from the command line.
There are also GUI programs for transferring files using the SCP or SFTP protocol,
such as WinSCP for Windows and
Cyberduck for Mac.
FileZilla is another GUI software for FTP which is available on all platforms.
Globus Online provides a transfer service for moving data between institutions such as Brown and XSEDE facilities. You can transfer files using the Globus web interface or the command line interface.
Nightly snapshots of the file
system are available for the trailing seven days:
- Home directory snapshot
- Data directory snapshot
- Scratch directory snapshot
Do not use the links in your home directory snapshot to try to retrieve snapshots of data and scratch;
the links will always point to the current versions of these files. An easy way to check what a link is pointing to is to use ls -l:
ls -l /gpfs_home/.snapshots/April_03/ghopper/data
lrwxrwxrwx 1 ghopper navy 22 Mar 1 2016 /gpfs_home/.snapshots/April_03/ghopper/data -> /gpfs/data/navy
If files to be restored were modified or deleted more than 7 days ago (but less than 30 days ago) and were in your home or data directory, you can contact us to retrieve them from nightly backups by providing the full path. Note that home and data directory backups are retained for the trailing 30 days only.
Best Practices for I/O
Efficient I/O is essential for good performance in data-intensive applications.
Often, the file system is a substantial bottleneck on HPC systems, because CPU
and memory technology has improved much more drastically in the last few
decades than I/O technology.
Parallel I/O libraries such as MPI-IO, HDF5 and netCDF can help parallelize,
aggregate and efficiently manage I/O operations. HDF5 and netCDF also have the
benefit of using self-describing binary file formats that support complex data
models and provide system portability. However, some simple guidelines can be
used for almost any type of I/O on Oscar:
- Try to aggregate small chunks of data into larger reads and writes. For the
GPFS file systems, reads and writes in multiples of 512KB provide the best performance.
- Avoid using ASCII representations of your data. They will
usually require much more space to store, and require conversion to/from binary
on every read and write.
- Avoid creating directory hierarchies with thousands or
millions of files in a directory. This causes a significant overhead in
managing file metadata.
While it may seem convenient to use a directory hierarchy for managing large
sets of very small files, this causes severe performance problems due to the
large amount of file metadata. A better approach might be to implement the data
hierarchy inside a single HDF5 file using HDF5's grouping and dataset
mechanisms. This single data file would exhibit better I/O performance and
would also be more portable than the directory approach.