Tuesday, March 25, 2014

A Family Tree of all mankind - part 7

Gaussian Distribution.
      Let us return to the question: "How many generations of descent are there between myself and Charlemagne, born in 741 AD?" I jump into the tree and start counting along some arbitrary line of descent, and I count 40. If I do it again along a different path, I count 46. Out of all the lines entered into my tree, I am able to determine that the maximum number of generations between Charlemagne and myself is 49. I am unable to determine the minimum with certainty; the lowest I have counted after a good investigation is 36, but I'm going to take 35 as the minimum. MOST counts come up at about 42, with few coming up at either extreme. This looks like a Gaussian distribution! It may be skewed one way or the other, but that is too hard for me to investigate formally. Let us hypothesize that it's symmetrical around the mean, and we can now answer the original question in a definite way. How many generations back to Charlemagne? The answer is:



      He is ALL of the above, all at the same time. He is our 33rd Great Grandfather, and he is also our 47th Great Grandfather, and everything in between. (An ancestor N generations back is an (N-2)th Great Grandfather, so 35 generations gives the 33rd and 49 generations gives the 47th.) If we must pick one number, we should say he is our 40th Great Grandfather (42-2), since that's where he occurs the most in the pedigree. But to be completely correct and mathematically rigorous, his ancestral distance from us is a Gaussian with some mu (mean) and sigma (standard deviation). If you are not familiar with the normal distribution you should read up on it, because it turns up almost everywhere.



      What about your grandparents: how far away are they from you? 2 generations, of course. But even that comes with a Gaussian distribution; it's just that the standard deviation for such a near ancestor is zero, because all 4 data points lie on the mean! To generalize, the distance from you to any ancestor, even one as near as your parents, can be described as a Gaussian distribution characterized by the 2 parameters mu and sigma. The further back you go, the greater both mu and sigma become, and the more closely the curve approximates the normal. The nearer in time an ancestor is to you, the more "lumpy" the curve will look, since with so few data points it can't look very normal yet. I'm going to call this phenomenon in genealogy Pedigree Collapse Distribution (PCD), although I haven't ever seen that term before. Note: The true distribution is not likely to be a true Gaussian, but I think that's the best way to imagine it.
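
      To make mu and sigma concrete, here is a minimal Python sketch. It is not taken from my actual tree: the tiny pedigree, the names in it, and the helper functions `path_lengths` and `mu_sigma` are all hypothetical, made up for illustration. It counts every line of descent from "me" to a given ancestor, grouped by number of generations, and then computes the mean and standard deviation of those counts.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy pedigree (child -> known parents), invented for illustration.
# "old_john" appears twice: once 3 generations back through the father's side
# and once 4 generations back through the mother's side (pedigree collapse).
PARENTS = {
    "me": ["father", "mother"],
    "father": ["fathers_father", "fathers_mother"],
    "mother": ["mothers_father", "mothers_mother"],
    "fathers_father": ["old_john"],
    "mothers_mother": ["anne"],
    "anne": ["old_john"],
}

def path_lengths(person, ancestor):
    """Return a Counter mapping generations -> number of lines of descent."""
    if person == ancestor:
        return Counter({0: 1})
    total = Counter()
    for parent in PARENTS.get(person, []):
        for generations, lines in path_lengths(parent, ancestor).items():
            total[generations + 1] += lines
    return total

def mu_sigma(counts):
    """Mean and (population) standard deviation of the generation counts."""
    n = sum(counts.values())
    mu = sum(g * c for g, c in counts.items()) / n
    variance = sum(c * (g - mu) ** 2 for g, c in counts.items()) / n
    return mu, sqrt(variance)

# A grandparent: a single line of descent, so sigma is zero.
print(mu_sigma(path_lengths("me", "fathers_father")))  # (2.0, 0.0)

# A collapsed ancestor: lines of 3 and 4 generations, so sigma is no longer zero.
print(mu_sigma(path_lengths("me", "old_john")))        # (3.5, 0.5)
```

      With only two lines of descent the "curve" for old_john is as lumpy as it gets. On a real tree with thousands of lines the same counting would need memoization to run in reasonable time, and whether the result actually looks normal is exactly the open question above.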
      I'm not sure how the aristocracy bias inherent in my tree affects the PCD shape, but I would think it introduces skew. We know there is a demographic difference between the rich and the poor, and we also know that the average longevity of humans (aka life expectancy) has been increasing with time. I found the following chart online:

  
    

      So I make the assumption that the rich had a longer average life expectancy than the poor, which I think is very reasonable. I maintain that my postulated "average age between generations" of 28.5 years in my tree may still be plausible, even though the chart above shows a lower average life expectancy for the population as a whole. Without the aristocracy bias, it would clearly not be possible for the average span between generations to be 28.5 years when the average life expectancy of the population is 25.

      One more thing about Pedigree Collapse. There was a famous debate in 1860 between Samuel Wilberforce, Bishop of Oxford, and T. H. Huxley, who championed Darwin's theory of evolution.
      "...the Bishop rose, and in a light scoffing tone assured us there was nothing in the idea of evolution. Then, turning to his antagonist (Huxley) with a smiling insolence, he begged to know, was it through his grandfather or his grandmother that he claimed his descent from a monkey?..."
      Sounds like the bishop thought he was quite clever. Hopefully the answer is apparent enough: both.



Continued on page 8.