MDCS Troubleshooting

From HPC users

1. Data too large to be saved: In some cases your MDCS workers may generate more data during a computation than MATLAB can return to the client. The exact limit is not documented; it is most likely 4 GB combined for all workers. If you see an error similar to this:

Error using parallel.Job/load (line 36)
Error encountered while running the batch job. The error was:
The task result was too large to be stored.

Caused by:
     Error using parallel.internal.cluster/FileSerializer>iSaveMat (line 278)
     Data too large to be saved.

you can work around this by saving part (or all) of the workspace to a file, e.g. on $WORK, by adding something like this to the end of your Matlab program:

path = getenv('WORK');             % if needed, modify path to add subdirectories in WORK
file = fullfile(path, 'bigdata.mat');
save(file, 'bigdata', '-v7.3');    % -v7.3 MAT-file format is required for variables larger than 2 GB
clear bigdata

Here bigdata is an array (or similar variable) holding a large amount of data. The data is written to a file and then removed from the workspace. Since WORK can also be mounted on your local computer, you can then easily load the data from the file in a local Matlab session.
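Loading the result back is then only a few lines; a minimal sketch, assuming WORK is mounted locally and the $WORK environment variable points at it (adjust the path to however WORK appears on your machine):

file = fullfile(getenv('WORK'), 'bigdata.mat');
S = load(file, 'bigdata');    % load returns a struct with one field per requested variable
bigdata = S.bigdata;

Loading into a struct rather than calling load without an output avoids silently overwriting variables already in your local workspace.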

More troubleshooting tips can be found on the official Matlab website; also, do not hesitate to contact Scientific Computing with any problem you may encounter.