Run SNAP on supercomputer (multiple nodes)

Hi, I'm trying to run SNAP (Java API) on a supercomputer where you can reserve a number of nodes (16 cores each) for the processing. However, it seems that SNAP only uses one of the nodes (the first in the list). I have tried modifying the snap.properties file (i.e., uncommented the line and set snap.parallelism=64), without any effect.

If anyone has some ideas about how to solve this I would be very grateful.

cheers

Martin

It depends what kind of processing you are doing.
@lveci wrote a post answering a similar question:

Dear STEP forum users
I just came across this thread (2017; we are in 2023).
We are finally using high-performance computers with multiple nodes, and have only just realized that SNAP (v9.0) and the gpt command use one node only, leaving all the other nodes and resources idle.

Can you help?
Can we tweak this and make it work on multiple nodes, with multiple cpus-per-task, etc.?
David

SNAP does have “Remote execution” functionality that one can access via “Tools”, but I do not know whether many people have tried using it.

Usually, teams with access to HPC handle the orchestration of SNAP outside SNAP itself.
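
As a rough illustration of what such outside orchestration could look like, here is a minimal Python sketch that lets each worker pick its own share of the scenes. It assumes a Slurm job array submitted with indices 0..N-1 (so that SLURM_ARRAY_TASK_ID and SLURM_ARRAY_TASK_COUNT are set); the function names are hypothetical and nothing here is part of SNAP.

```python
import os


def select_share(scenes, task_id, task_count):
    """Return the scenes one worker should process (round-robin split)."""
    # Sorting makes every worker see the same order, so the shares never overlap.
    return sorted(scenes)[task_id::task_count]


def share_from_environment(scenes, env=os.environ):
    """Pick this worker's share based on Slurm job-array variables."""
    # Assumption: the array was submitted with 0-based indices (e.g. --array=0-3).
    # Outside a job array we fall back to a single worker that takes everything.
    task_id = int(env.get("SLURM_ARRAY_TASK_ID", 0))
    task_count = int(env.get("SLURM_ARRAY_TASK_COUNT", 1))
    return select_share(scenes, task_id, task_count)
```

Each array task would then process only its own list, with one SNAP instance per node.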

Adjusting snap.parallelism in snap.properties might not suffice. Check for parallel processing settings or contact SNAP support for assistance.

Hi Allene
I have the exact same problem. The suggestion kindly posted by Marcus @mengdahl is, I think, not the solution and it is my belief that the problem persists.
I am not sure how to resolve this. Can the technical team at Brockmann, @marpet if possible, provide alternative solutions we could explore? This would be greatly appreciated, as the use of multiple cores by SNAP would speed up the processing and indeed make full use of the technology at hand. Thank you.

SNAP can utilize all the cores on your HPC VM or physical CPU node, but it has no functionality for running on multiple nodes other than the rudimentary one available at “Remote Execution” under “Tools”. If you have great IT support, perhaps they could set up the distribution of scenes to be processed on multiple VMs as needed?
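
To make the idea concrete, a minimal Python sketch of the per-node part might look like this: each node receives a list of scenes and runs one gpt call per scene, using all local cores via gpt's -q (parallelism) option. The graph file and its ${input} and ${output} variables are assumptions for illustration, as are the function names and the output file naming.

```python
import subprocess
from pathlib import Path


def build_gpt_command(graph, scene, out_dir, cores):
    """Build one gpt call that processes a single scene on the local node."""
    # Hypothetical naming scheme for the target product.
    target = Path(out_dir) / (Path(scene).stem + "_processed.dim")
    return [
        "gpt", str(graph),
        "-q", str(cores),         # maximum parallelism on this node
        f"-Pinput={scene}",       # assumes the graph XML uses ${input}
        f"-Poutput={target}",     # assumes the graph XML uses ${output}
    ]


def process_scenes(scenes, graph, out_dir, cores, dry_run=True):
    """Run (or, with dry_run, only list) the gpt calls for this node's scenes."""
    commands = [build_gpt_command(graph, s, out_dir, cores) for s in scenes]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # stop at the first failing scene
    return commands
```

With dry_run=True the function only returns the command lists, which is handy for checking what would be launched before submitting a real job.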

As @mengdahl already said, SNAP is not a cluster or super-computer software, but it can be used on a cluster. Clusters can be operated differently and thus it is the task of the cluster operators to integrate SNAP. You can use Apache Hadoop or Spark or other means.

I think the remote execution tool included in SNAP was implemented and used by @kraftek. Maybe he has some hints on how you can use it.

The team at Brockmann offers Calvalus which you can use to run your cluster and it has special support for SNAP.

By the way, I don’t use the @marpet account anymore (see Farewell & Welcome SNAP Community) and I don’t visit this forum as often. You can use the EOMasters Forum if you have a question for me.
