What is the best way to calculate the right number of Hadoop mappers and reducers to use, depending on the instances used/available on Amazon Elastic MapReduce? (using RecommenderJob of the mahout-core-0.7 distribution)
The generic Hadoop guidance applies here.
For EMR, look up the number of reducers that are run by default on the instance type that you're using: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/HadoopMemoryDefault_AMI2.3.html
Then multiply that by the number of worker nodes you are using. That gives a good number of reducers -- or even a small multiple of it.
Until you have a specific reason to think these aren't optimal, I'd go with this.
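As a concrete sketch of the calculation above: the per-node reducer count below (2) is just an illustration -- look up the real default for your instance type in the AWS table linked above. The bucket paths in the comment are placeholders. Mahout's RecommenderJob goes through ToolRunner, so it accepts the standard Hadoop `-D` options:

```shell
# Assumed values -- replace with the defaults for your instance type
# and the actual size of your cluster.
REDUCERS_PER_NODE=2
NUM_WORKERS=10

NUM_REDUCERS=$((REDUCERS_PER_NODE * NUM_WORKERS))
echo "reducers: $NUM_REDUCERS"

# Then pass it to RecommenderJob, e.g.:
# hadoop jar mahout-core-0.7-job.jar \
#   org.apache.mahout.cf.taste.hadoop.item.RecommenderJob \
#   -Dmapred.reduce.tasks=$NUM_REDUCERS \
#   --input s3://your-bucket/input \
#   --output s3://your-bucket/output
```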
PS: Don't forget to use spot instances for your workers to save money and/or deploy more workers.
Ad break: if you are interested in Mahout, recommendations, and running on EMR, you should probably be looking at Myrrix. I'm the founder, and also the author of some of the Mahout code you're running now. It is a "next-gen" Hadoop-based recommender product that, among other things, is already well optimized for EMR.