DFTD3 in MPI
When running GPAW, one usually invokes the Python script with something along the lines of mpiexec gpaw-python run.py, which starts the script in parallel on every MPI rank.
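For context, such a run.py might look roughly like the hypothetical sketch below; the silicon structure and the GPAW parameters are arbitrary placeholders rather than an actual input.

# Hypothetical run.py: a GPAW calculator wrapped by ASE's DFTD3 calculator,
# started with e.g. `mpiexec -np 64 gpaw-python run.py`.
from ase.build import bulk
from ase.calculators.dftd3 import DFTD3
from ase.parallel import parprint
from gpaw import GPAW

atoms = bulk('Si')
dft = GPAW(mode='pw', xc='PBE', txt='gpaw.txt')
atoms.calc = DFTD3(dft=dft)              # DFT energy plus D3 dispersion correction
parprint(atoms.get_potential_energy())   # triggers both GPAW and the dftd3 executable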
Now, if the GPAW calculator object is connected to a DFTD3 object, the call to the DFTD3 executable is also made on every node, meaning that line 234,

errorcode = subprocess.call(command, cwd=self.directory, stdout=f)

is called maybe 64 times. Luckily, this seems to execute correctly. Later, however, we have a race condition: some nodes reach the point in read_results() where they attempt to open the file dftd3_gradient while other nodes are still inside the call mentioned above, causing all sorts of weird errors.
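The timing problem can be sketched with a small standalone script; the shell command and the file name below are made up and only mimic dftd3 runs that finish at different times on different nodes.

# race.py -- run with e.g. `mpiexec -np 4 gpaw-python race.py`.
# Every rank launches the same command in the same directory and then reads
# the output file without any synchronisation, so a fast rank can read the
# file while a slow rank's command is still (re)writing it.
import subprocess
from ase.parallel import world

# A rank-dependent delay stands in for differently loaded nodes.
subprocess.call('sleep %.1f; echo done > marker.txt' % (0.3 * world.rank),
                shell=True)

# Analogue of read_results(): no barrier before the read, so this may see
# an empty or truncated file.
print(world.rank, repr(open('marker.txt').read()))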
My suggestion would be to change the call() to use the parallel options in ASE, which could look something along the lines of
# Finally, call dftd3 and parse results.
with paropen(self.label + '.out', 'w') as f:
    if world.rank == 0:
        errorcode = subprocess.call(command,
                                    cwd=self.directory, stdout=f)
    else:
        errorcode = None
world.barrier()
errorcode = broadcast(errorcode, 0)
and of course add from ase.parallel import paropen, world, broadcast to the import section.
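For completeness, a minimal standalone sketch of the same master-only pattern; echo and test.out are placeholders for the dftd3 command and its output file.

# fixed.py -- run with e.g. `mpiexec -np 4 gpaw-python fixed.py`.
# Only rank 0 executes the external command; the barrier keeps the other
# ranks from reading any output prematurely, and the broadcast gives every
# rank the same errorcode so error handling stays consistent.
import subprocess
from ase.parallel import paropen, world, broadcast

with paropen('test.out', 'w') as f:       # real file on rank 0, dummy elsewhere
    if world.rank == 0:
        errorcode = subprocess.call(['echo', 'hello'], stdout=f)
    else:
        errorcode = None

world.barrier()
errorcode = broadcast(errorcode, 0)
print(world.rank, errorcode)              # every rank now sees errorcode == 0

This way, no rank can proceed to read dftd3_gradient before the dftd3 run on rank 0 has finished, and a non-zero errorcode is seen consistently on all ranks.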