Computation Time

Simulating a 100kb Sequence

Among the main contributors to the computational cost of a forward simulation are the number of individuals (N) that are simulated (which increases both the number of generations and the number of computations per generation) and the length of the sequence (which determines, among other things, the number of mutation and recombination events that take place each generation).

In the following table, I report the time (in seconds) for a single simulation (10N generations) with a fixed θ (4N_eμ) = ρ (4N_er) = 0.001/site in a population of constant size N diploid individuals (2N chromosomes) across a sequence of length L basepairs. Values within a column show the direct impact of increasing the population size (as the number of mutation and recombination events is held fixed).
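To see why the number of events does not change with N, the parameters above can be worked through as a short calculation. This is a minimal sketch, not part of the simulator: the function name is hypothetical, it assumes N_e = N, and it approximates the number of recombination breakpoints by L rather than L - 1.

```python
def per_generation_events(theta_per_site, rho_per_site, n_diploid, length_bp):
    """Expected event counts per generation for a constant-size population.

    Assumptions (not from the simulator itself): N_e equals the census
    size N, and recombination can occur at each of the L sites.
    """
    mu = theta_per_site / (4 * n_diploid)   # per-site mutation rate, from theta = 4*N*mu
    r = rho_per_site / (4 * n_diploid)      # per-site recombination rate, from rho = 4*N*r
    chromosomes = 2 * n_diploid             # diploid population
    return {
        "generations": 10 * n_diploid,                   # run length used for the timings
        "mutations": chromosomes * mu * length_bp,       # simplifies to theta * L / 2
        "recombinations": chromosomes * r * length_bp,   # simplifies to rho * L / 2
    }


if __name__ == "__main__":
    for n in (500, 1000, 2000):
        print(n, per_generation_events(0.001, 0.001, n, 100_000))
```

Under these assumptions the expected number of mutations and of recombination events per generation is θL/2 = ρL/2, about 50 for a 100 kb sequence, whatever the value of N. Only the number of generations (10N) and the per-generation work over 2N chromosomes grow with the population size.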

The terminal row and column (labeled RI) show the mean relative increase in time due to increasing the number of individuals and sequence length (respectively). It is calculated as AVG[(t_i x_i)/(t_{i-1} x_{i-1})], where t_i is the time in cell entry i and x_i is the size corresponding to row or column i. Notice that RI tends to decrease as you move down the column or across the row.
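The RI calculation can be sketched in a few lines. This sketch takes the formula exactly as printed above, a ratio of the products of time and size for consecutive cells. If a different normalization was intended, the ratio expression is the one line to change. The function name and the sample values are hypothetical placeholders, not measurements from the table.

```python
def mean_relative_increase(times, sizes):
    """Average of (t_i * x_i) / (t_{i-1} * x_{i-1}) over consecutive cells.

    times: run times for one row or column, in table order
    sizes: the matching population sizes or sequence lengths
    """
    if len(times) != len(sizes) or len(times) < 2:
        raise ValueError("need two or more matching time/size entries")
    ratios = [
        (times[i] * sizes[i]) / (times[i - 1] * sizes[i - 1])
        for i in range(1, len(times))
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    print(mean_relative_increase([1.0, 2.0, 4.0], [1, 2, 4]))
```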

These computations were performed on a system with dual quad-core 2.8 GHz processors running Mac OS X (v10.5).

Memory Requirements

./sfs_code 1 10 -A -L <N> <100000/N> -o out100kb_10.txt -r 0.001 -a N -b 5