I need to write a compute-intensive simulation program. I tried a multi-threaded version, but it's taking too much time, so now I plan to scale out to multiple nodes (probably on Amazon EC2).
I'm already familiar with Python. Is Python with some parallel-processing module a viable option if I care about speed, or would I be better off with another framework/language such as Erlang?
Can you even write a simulation program in Erlang?
The project is more about dividing up computation than dividing up a dataset, so I didn't consider MapReduce-based frameworks.
dispy is a framework for distributed computing with Python. It uses asyncoro, a framework for asynchronous, concurrent programming with coroutines, which offers some Erlang-like features (broadly speaking). Disclaimer: I am the author of both of these frameworks.
If you are familiar with Python already, I would recommend you keep the simulation in Python (and speed up critical parts in C) and use Erlang to manage it. Writing the simulation itself in Erlang would take you far outside your comfort zone (even though I personally would do it). You can probably reuse parts of Erlang projects such as Disco or Riak Core. Start your project with a sub-optimal proof of concept and tune it in iterations. That means: start with Python, embed it in Erlang (probably via Disco), and then move pieces around until you are happy with the performance and features. You could end up with anything from a pure Erlang solution to Python embedded in the BEAM via a NIF, or whatever else satisfies your needs.
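The "embed Python under Erlang" step can start very simply: Erlang ports talk to external programs over stdin/stdout, so the Python side only needs a read-compute-reply loop. A sketch under assumptions (the one-JSON-object-per-line protocol and the `handle` kernel are hypothetical, not part of Disco or any named project):

```python
# Hypothetical Python worker driven by an Erlang port.
# Protocol (assumed): one JSON request per line in, one JSON reply per line out.
import json
import sys

def handle(request):
    # Stand-in for the real simulation kernel.
    n = request["n"]
    return {"result": sum(i * i for i in range(n))}

def serve(infile, outfile):
    # Read requests line by line, reply immediately; flush so the
    # Erlang side sees each reply without buffering delays.
    for line in infile:
        outfile.write(json.dumps(handle(json.loads(line))) + "\n")
        outfile.flush()

if __name__ == "__main__":
    serve(sys.stdin, sys.stdout)
```

On the Erlang side this worker would be started with `open_port/2`; the same loop also works unchanged under plain SSH for a quick multi-node proof of concept.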
Does your problem parallelize trivially? If so, you may want to take a look at Elastic MapReduce instead of plain EC2.