-
Also, it may be nice to integrate some examples of running in Jupyter notebooks, as Professor Alex Tait (@atait) did.
-
Just log onto the remote server (with ssh or similar) and then run your Meep script there.

Do you mean interactively with Jupyter notebooks? You can do this in a variety of ways, e.g. by tunneling through ssh.
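As a rough illustration of the ssh tunneling approach, the sketch below builds the port-forwarding command in Python. The user name, host name, and port numbers are hypothetical placeholders, and it assumes a notebook server has already been started on the remote machine (for example with `jupyter notebook --no-browser --port=8888`).

```python
import shlex


def jupyter_tunnel_command(user, host, remote_port=8888, local_port=8888):
    """Build an ssh command that forwards local_port to a notebook server
    listening on remote_port on the remote host."""
    return [
        "ssh",
        "-N",  # do not run a remote command; only keep the tunnel open
        "-L", f"{local_port}:localhost:{remote_port}",  # local port forwarding
        f"{user}@{host}",
    ]


# Hypothetical user and host, for illustration only.
cmd = jupyter_tunnel_command("alice", "sim-server.example.com")
print(shlex.join(cmd))
```

Running the printed command on your local machine and then opening `http://localhost:8888` in a browser should reach the remote notebook server, provided the ports match.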
-
@oskooi has a lot of experience running Meep on AWS and similar cloud servers, if that's what you mean.
-
I think @joamatab is referring to running Meep on systems (like Google Cloud) where you may not have access to the same hardware for an extended period of time (similar to our discussion with @ianwilliamson some time ago). Specifically, platforms like this often require a slightly more "dynamic" job manager, since MPI is quite brittle to worker failure.

Even with Google Cloud, though, you should be able to run your script on multiple processes using MPI. I've had success with this on a variety of (commercial) clusters.
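One very simple form of a more "dynamic" job manager is a wrapper that relaunches the job whenever it exits with an error. The sketch below is only an assumption about how that could look, not something from this thread: `run_with_restarts` and `mpi_command` are hypothetical helpers, the script name is a placeholder, and the wrapper is only useful if the script itself can resume from saved state.

```python
import subprocess
import sys
import time


def mpi_command(num_procs, script):
    """Build a launcher command of the usual `mpirun -np N python script.py` form."""
    return ["mpirun", "-np", str(num_procs), sys.executable, script]


def run_with_restarts(cmd, max_attempts=3, delay_s=0.0):
    """Run cmd, relaunching it if it exits with a nonzero status.

    Returns the number of attempts used, or raises RuntimeError if
    every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        time.sleep(delay_s)  # optional pause before relaunching
    raise RuntimeError(f"job failed after {max_attempts} attempts")
```

A call such as `run_with_restarts(mpi_command(4, "my_simulation.py"))` would then retry the whole MPI job up to three times. The script name here is a placeholder.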
-
In practice, the statement that "MPI is quite brittle to worker failure" is typically an issue only when the cluster is configured with a large number of virtual machines (roughly 10 or more) running for a long period of time (several hours or days). The larger the number of VMs and the longer they run, the higher the probability that at least one of them is preempted.

Note that, in any circumstance in which the job may need to be restarted, Meep supports checkpointing the simulation state. Thus, for MPI jobs in the public cloud, it is good practice to set up your job to periodically dump its state to persistent storage.

Combined with this checkpointing, the absence of fault tolerance in the MPI standard can be mitigated in most public clouds in another simple way: for jobs with a fixed number of Meep chunks (or CPU cores), reconfigure the cluster to use a smaller number of nodes but with a larger number of vCPUs per node. A large degree of runtime customization is a key feature of all major public clouds. As an example using GCP, a parallel simulation requiring 224 cores or MPI processes can be executed using: (1) two nodes of …

Based on this, I think practically all the pieces are there now to be able to scale up Meep jobs in the public cloud to arbitrary size problems.
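The trade-off between node count and preemption risk can be made concrete with a small calculation. This is a sketch under stated assumptions rather than a model of any real cloud: it treats preemptions as independent with the same per-VM probability (the value 0.05 is made up), assumes one MPI process per vCPU, and uses a hypothetical limit of 112 vCPUs per node.

```python
def prob_any_preempted(num_vms, p_single):
    """Probability that at least one of num_vms VMs is preempted, assuming
    independent preemptions with per-VM probability p_single."""
    return 1.0 - (1.0 - p_single) ** num_vms


def node_layouts(total_procs, max_vcpus_per_node):
    """List (nodes, vcpus_per_node) pairs that split total_procs evenly,
    ordered from fewest nodes to most."""
    layouts = []
    for nodes in range(1, total_procs + 1):
        per_node = total_procs // nodes
        if total_procs % nodes == 0 and per_node <= max_vcpus_per_node:
            layouts.append((nodes, per_node))
    return layouts


# 224 MPI processes, hypothetical cap of 112 vCPUs per node.
for nodes, per_node in node_layouts(224, 112)[:4]:
    risk = prob_any_preempted(nodes, p_single=0.05)  # made-up per-VM probability
    print(f"{nodes:3d} nodes x {per_node:3d} vCPUs: P(any preempted) = {risk:.3f}")
```

Under these assumptions the layout with the fewest nodes has the lowest chance of losing a worker during the run, which is the motivation for preferring fewer, larger nodes.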
-
What is the recommended way to run Meep on a remote server?
Some ideas for plugins or demos to add to Meep:
GCP announced Cloud Run jobs: https://cloud.google.com/run/docs/overview/what-is-cloud-run#jobs
We could add a silicon photonics RAD Lab demo running Meep on the cloud:
https://github.com/GoogleCloudPlatform/rad-lab/tree/main/modules/silicon_design
@proppy
@flaport
@simbilod
@HelgeGehring