Comparing SFS_CODE to Coalescent and Poisson Random Field Expectations

 

The demographic model (described in words) is as follows:  The ancestral human population expanded 1.8-fold 0.63×2Na generations ago (where Na is the ancestral effective population size).  Then, 0.27×2Na generations ago, the European population left Africa with an effective population size just 0.15×Na.  After another 0.21×2Na generations, the European population went through another 2-fold bottleneck, but has since been recovering with exponential growth.  This model has complicated dynamics, with multiple size changes, making it very useful for comparing SFS_CODE to ms.  The command-lines for this demographic model are rather large, with both SFS_CODE and ms commands consuming 2 lines of text.


sfs_code 2 15000 -n 10 -r 0.001 -TS 0.359 0 1 -TE 0.629 -Td 0 P 0 1.815
-Td 0.359 P 1 0.158 -Td 0.566 P 1 0.523 -Tg 0.566 P 1 49.492


ms 40 15000 -t 5 -r 5 5000 -I 2 20 20 -ej 0.134 2 1 -n 1 1.815 -n 2 3.369
-eg 0 2 99.071 -en 0.031 2 0.286 -en 0.314 1 1


First I show plots akin to the 1-population analysis, with the marginal number of SNPs (upper left), π (upper right), number of haplotypes (lower left) and marginal site-frequency spectrum for a given population (lower right).  First up:  YRI population.

 
  1. 1.No migration

  1. 2.With migration

Finally, I add migration to the two population dynamics.  Rather than include a constant rate of migration throughout the demographic history, however, there are two phases.  Immediately after the split of the two populations, there is limited migration.  However, after the second bottleneck, the migration rates increase.  Both commands are presented here for completeness.


sfs_code 2 15000 -n 10 -r 0.001 -TS 0.359 0 1 -TE 0.629 -Td 0 P 0 1.815
-Td 0.359 P 1 0.158 -Td 0.566 P 1 0.523 -Tg 0.566 P 1 49.492
-Tm 0.359 P 0 1 6.340 -Tm 0.359 P 1 0 1.00186 -Tm 0.566 L 0.759 0.0628


ms 40 15000 -t 5 -r 5 5000 -I 2 20 20 -ej 0.134 2 1 -n 1 1.815 -n 2 3.369

-eg 0 2 99.071 -en 0.0314 2 0.286 -en 0.314 1 1 -ma x 0.837 0.837 x 

-ema 0.0314 2 0 6.985 6.985 0


I now show figures that are equivalent to the ones above, but for the case of migration.  First the marginal YRI dynamics.

Finally, the joint site-frequency spectra for the case of migration.

Next, I show the same plots for the CEU population.

Next I show the joint site-frequency spectra.  The top two panels show the number of SNPs (colored using a heat-map) at frequency (i,j), indicating that i chromosomes from the YRI sample and j chromosomes from the CEU sample carry the mutation (0 ≤ i,j ≤ 20).  The lower left panel shows the distribution of the residuals (SFS_CODEi,j - msi,j)/sqrt(SFS_CODEi,j).  The lower right panel shows the distribution of residuals.  Note that they are centered about zero, which largely indicates that the two programs are generating equivalent results.  Note that slight color differences in the two upper panels indicated slightly different cutoffs used by the image function in the R statistical package.

Now the marginal CEU dynamics.