The project "resource theory for microprocessors" shows how the execution time depends on a number of parameters. Its conclusions can be summarized as follows:
The best allocation algorithm keeps the calculation time from leaf to root below a given bound on every branch, and that bound should be as low as possible.
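The quantity to be bounded can be illustrated with a small sketch. It assumes the computation is a tree in which each node has a fixed calculation time; the names `children`, `compute_time`, and `worst_leaf_to_root_time` are hypothetical and do not come from the project. An allocation is better when this worst-branch value is lower.

```python
from typing import Dict, List


def worst_leaf_to_root_time(children: Dict[str, List[str]],
                            compute_time: Dict[str, float],
                            node: str) -> float:
    """Return the largest accumulated time on any branch from a leaf up to `node`."""
    kids = children.get(node, [])
    if not kids:
        # A leaf contributes only its own calculation time.
        return compute_time[node]
    # An inner node must wait for its slowest subtree, then do its own work.
    return compute_time[node] + max(
        worst_leaf_to_root_time(children, compute_time, kid) for kid in kids
    )


# Hypothetical example tree (illustrative values only): root "r" with two subtrees.
children = {"r": ["a", "b"], "a": ["c"], "b": []}
compute_time = {"r": 1.0, "a": 2.0, "b": 4.0, "c": 0.5}
worst = worst_leaf_to_root_time(children, compute_time, "r")
```

Here the branch through "b" dominates, so reducing the time on the branch through "a" and "c" would not improve the bound.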
The delay of a link depends on its load. The load should therefore be spread across the links rather than concentrated on a few of them.
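The effect of spreading the load can be sketched with a simple delay model. The model below is an assumption chosen only for illustration (delay grows without bound as the load approaches the link capacity); the project text does not state which delay function it uses, and `link_delay`, `capacity`, and `base_delay` are hypothetical names.

```python
def link_delay(load: float, capacity: float = 1.0, base_delay: float = 1.0) -> float:
    """Assumed model: delay = base_delay / (1 - load / capacity)."""
    if not 0.0 <= load < capacity:
        raise ValueError("load must satisfy 0 <= load < capacity")
    return base_delay / (1.0 - load / capacity)


# The same total load (0.8) placed on two links in two different ways.
concentrated = [link_delay(0.8), link_delay(0.0)]  # all traffic on one link
balanced = [link_delay(0.4), link_delay(0.4)]      # traffic split evenly

worst_concentrated = max(concentrated)
worst_balanced = max(balanced)
```

Under this assumed model the worst link delay is clearly lower for the balanced case, which is the motivation for distributing the load.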
Placement can, generally speaking, be compared to placing components and routing the wiring on a printed circuit board. Because the problem is NP-complete, no efficient method is known for finding the best solution. However, there are good solutions.
The problem above concerns one set of input data, so there is only one graph with a particular content. For other inputs the graph changes. In certain cases, such as signal processing, the graph keeps the same structure but has different contents. In a compilation problem, both the structure and the contents of the graph are quite different.
The allocation algorithm must be designed so that alternative placements for the changed graphs also result in short latency.
A random-based placement of subgraphs is probably appropriate.
Placement can be performed at load time or at run time. To control run-time placement, a cost function is needed.
Different allocation algorithms are studied, primarily those based on simulated annealing.
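A minimal sketch of placement by simulated annealing is given here, under stated assumptions. The cost function (heaviest processor load plus a fixed penalty for each graph edge that crosses processors), the geometric cooling schedule, and all names and parameter values are hypothetical illustrations, not the algorithms studied in the project.

```python
import math
import random
from typing import Dict, List, Tuple


def placement_cost(placement: Dict[str, int], work: Dict[str, float],
                   edges: List[Tuple[str, str]], comm_cost: float = 1.0) -> float:
    """Hypothetical cost: heaviest processor load plus a penalty per cut edge."""
    load: Dict[int, float] = {}
    for task, proc in placement.items():
        load[proc] = load.get(proc, 0.0) + work[task]
    cut = sum(1 for u, v in edges if placement[u] != placement[v])
    return max(load.values()) + comm_cost * cut


def anneal(work: Dict[str, float], edges: List[Tuple[str, str]], n_procs: int,
           steps: int = 2000, t_start: float = 2.0, t_end: float = 0.01,
           seed: int = 0) -> Tuple[Dict[str, int], float]:
    """Search for a low-cost placement of tasks on processors."""
    rng = random.Random(seed)
    tasks = sorted(work)
    # Start from a random placement.
    current = {task: rng.randrange(n_procs) for task in tasks}
    cur_cost = placement_cost(current, work, edges)
    best, best_cost = dict(current), cur_cost
    for step in range(steps):
        # Geometric cooling from t_start towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        # Propose moving one randomly chosen task to a random processor.
        task = rng.choice(tasks)
        old_proc = current[task]
        current[task] = rng.randrange(n_procs)
        new_cost = placement_cost(current, work, edges)
        delta = new_cost - cur_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = dict(current), cur_cost
        else:
            current[task] = old_proc  # reject the move
    return best, best_cost


# Hypothetical task graph: four tasks with work estimates and two dependencies.
work = {"a": 3.0, "b": 2.0, "c": 2.0, "d": 1.0}
edges = [("a", "b"), ("c", "d")]
best, best_cost = anneal(work, edges, n_procs=2)
```

Accepting some cost-increasing moves at high temperature lets the search leave poor local minima. The same cost function could in principle also guide placement decisions made at run time.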