Pseudogenes

 

Sean D. Pitman M.D.

© January 2001

Latest Update: November 2008

Pseudogenes are DNA sequences that resemble functional genes but are generally thought to have no purpose.  In fact, many scientists think that pseudogenes are nothing more than discarded genetic fossils of a bygone era when they did have some important function.  On this view, it logically follows that similar pseudogenes shared by different species give evidence of common ancestry and even of potential times of divergence.11  For example, the eta-globin pseudogene, which is found in both humans and chimps, has been used as an argument for the common ancestry of the two species.

The first pseudogene was reported in 1977.1  Since that time, a large number of these genes have been reported and described in humans and many other species.  

There are two types of pseudogenes, known as "processed" and "unprocessed" pseudogenes.2,11

Processed pseudogenes are found on different chromosomes from their functional counterparts.  They lack introns and certain regulatory sequences, often terminate in a run of adenines, and are flanked by direct repeats (which are associated with movable genetic elements).  They may be complete or incomplete copies of genes or mixtures of several genes.  They are believed to have arisen through a 3-step process: copying DNA into RNA, splicing out the introns to make mRNA, and then copying the mRNA back into DNA through reverse transcription.  This process is thought to have created the "L1 family of pseudogenes."2  Other theories invoke retroviruses as a means of pseudogene transport between different organisms.
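The three-step process described above can be sketched in a few lines of Python. This is only a toy illustration: the function names, the example sequence, the intron coordinates, and the poly-A length are all hypothetical, and the model tracks a single strand rather than the complementary cDNA that real reverse transcription produces.

```python
from typing import List, Tuple


def transcribe(dna: str) -> str:
    """Step 1: copy the DNA sense strand into RNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def splice(pre_mrna: str, introns: List[Tuple[int, int]]) -> str:
    """Step 2: remove introns, given as half-open (start, end) intervals."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


def reverse_transcribe(mrna: str) -> str:
    """Step 3: copy the mRNA back into DNA (simplified to the sense strand)."""
    return mrna.replace("U", "T")


def processed_copy(gene: str, introns: List[Tuple[int, int]],
                   poly_a_length: int = 10) -> str:
    """Return an intron-free DNA copy ending in a run of adenines."""
    mrna = splice(transcribe(gene), introns) + "A" * poly_a_length
    return reverse_transcribe(mrna)


# Hypothetical toy gene: exon, 12-base intron, exon.
toy_gene = "ATGGCC" + "GTAAGTTTTCAG" + "GAATGA"
print(processed_copy(toy_gene, [(6, 18)], poly_a_length=5))
```

The output has the two hallmarks named in the text: no intron and a trailing run of adenines.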

Unprocessed pseudogenes are usually found in clusters of similar functional sequences on the same chromosome.  They usually have introns and associated regulatory sequences.  Their expression is usually prevented by a "misplaced" stop codon or codons.  There may be other changes from the "original" as the result of deletions, insertions, and point mutations.  Some form of mRNA may or may not be produced depending on the damage to the gene.  Many of these are believed to have arisen by gene duplication, which produced an extra copy of the gene.  The extra copy could then accumulate mutations without harming the organism since it would still have a completely functional original copy.2  (The evolutionary gene duplication hypothesis suggests that, over time, random mutations may produce a new gene with new functions by using this gene duplicate while maintaining the original gene function.5)
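To make the idea of a "misplaced" stop codon concrete, here is a minimal sketch that scans a coding sequence codon by codon and reports the first stop codon that appears before the end. The function name and both toy sequences are hypothetical and are not taken from any real gene.

```python
from typing import Optional

# The three stop codons of the standard genetic code, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds: str) -> Optional[int]:
    """Return the 0-based codon index of the first in-frame stop codon
    that occurs before the final codon, or None if there is none."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            return index
    return None


# Hypothetical toy sequences differing by one point mutation (TGC -> TGA).
functional = "ATGGCTGAACGTTGCTAA"
pseudogene = "ATGGCTGAACGTTGATAA"

print(first_premature_stop(functional))
print(first_premature_stop(pseudogene))
```

A single base change is enough to truncate the reading frame in this toy case, which is the kind of damage the paragraph above describes.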

 

 

Shared Pseudogenes

 

Many, especially evolutionary biologists, hold that shared pseudogenes, which have no function in any form in different species, are evidence of common ancestry.  Comparison of DNA sequences from humans, chimps, and other mammals shows a great number of shared pseudogenes.  Perhaps the best-known example of a shared pseudogene is the eta-globin gene.

The eta gene is located on chromosome 11 in humans and is fourth in a series of 6 beta globin genes (five are functional).4  It has no start codon (AUG) and it has several stop codons.  So obviously, no mRNA is made and therefore no protein.  Humans, chimps, and gorillas have the same number of beta-globin genes arranged in the same sequence.  The exon sequences within these genes are also similar - as are the exons of the eta gene.4  It is thought that the eta-globin gene originated by a duplication of the gamma-A-globin gene because of the high similarity of the sequences.  Also, both genes are present in primates.  

The history of the eta-globin pseudogene is thought to begin some 140 million years ago, with marsupials and placental mammals.  After the "evolutionary divergence" of marsupials, the gamma-globin gene formed by duplication of an existing gene in the beta-globin family.  Later, but before radiation of the orders of placental mammals, the eta-globin gene formed from a duplication of the gamma-globin gene.  Gamma and eta genes must therefore have been present in ancestral placentals, but presumably gamma was lost by goats (which do not have gamma) and eta was lost by rabbits (which do not have eta).

According to this scenario, the eta gene must have been functional at first, because it is functional in goats today.2  It is non-functional in all primates, which is interpreted to mean it was already non-functional in ancestral primates some 70-80 million years ago.  This interpretation implies that the eta-globin gene has been maintained for more than 70 million years without being converted to a useful new gene and without being eliminated through random mutations.

 

 

Signs of Function?

  

So, the persistence of a non-functional DNA sequence in an entire lineage for such a supposedly long period of time seems remarkable in the context of the gene duplication hypothesis.  The very fact that pseudogenes are still present and recognizable after tens of millions of years without any beneficial function just doesn't seem to make sense.  Certainly, without some beneficial function, natural selection would not have maintained their sequences for such long periods of time.  There is in fact a cost to maintaining non-functional DNA: it takes energy to replicate and maintain DNA that doesn't pay for its keep.  Although this cost might seem small over the short term, even an extremely small cost compounded over the course of millions of generations starts to turn into a significant disadvantage.  So, the fact that pseudogenes have any recognizable gene-like structure at all suggests that they do in fact serve some kind of purpose.

 

   The persistence of pseudogenes is in itself evidence for their activity.  This is a serious problem for evolution, as it is expected that natural selection would remove this type of DNA if it were useless, since DNA manufactured by the cell is energetically costly.  Because of the lack of selective pressure on this neutral DNA, one would expect that ‘old’ pseudogenes would be scrambled beyond recognition as a result of accumulated random mutations.  Moreover, a removal mechanism for neutral DNA is now known.6

 

 

 

 

     “Typically when people say that the human genome contains 27,000 genes or so, they are referring to genes that code for proteins,” points out Michel Georges, a geneticist at the University of Liège in Belgium. But even though that number is still tentative—estimates range from 20,000 to 40,000—it seems to confirm that there is no clear correspondence between the complexity of a species and the number of genes in its genome. “Fruit flies have fewer coding genes than roundworms, and rice plants have more than humans,” notes John S. Mattick, director of the Institute for Molecular Bioscience at the University of Queensland in Brisbane, Australia. “The amount of noncoding DNA, however, does seem to scale with complexity.". . . 

        "Increasingly we are realizing that there is a large collection of ‘genes’ that are clearly functional even though they do not code for any protein” but produce only RNA, Georges remarks. The term “gene” has always been somewhat loosely defined; these RNA-only genes muddle its meaning further. To avoid confusion, says Claes Wahlestedt of the Karolinska Institute in Sweden, “we tend not to talk about ‘genes’ anymore; we just refer to any segment that is transcribed [to RNA] as a ‘transcriptional unit.’” Based on detailed scans of the mouse genome for all such elements, “we estimate that there will be 70,000 to 100,000,” Wahlestedt announced at the International Congress of Genetics, held this past July in Melbourne. “Easily half of these could be noncoding.” If that is right, then for every DNA sequence that generates a protein, another works solely through active forms of RNA—forms that are not simply intermediate blueprints for proteins but, rather, directly alter the behavior of cells.” . . . 

        “I think this will come to be a classic story of orthodoxy derailing objective analysis of the facts, in this case for a quarter of a century,” Mattick says. “The failure to recognize the full implications of this, particularly the possibility that the intervening noncoding sequences may be transmitting parallel information in the form of RNA molecules, may well go down as one of the biggest mistakes in the history of molecular biology.” [emphasis added] 16

 

 

 

Given this, it is not known whether all of what are currently thought of as pseudogenes truly have no function.  In fact, some pseudogenes are believed to function as sources of information for producing genetic diversity.  It is thought that partial pseudogenes are copied into functional genes during genetic recombination, producing variants of the functional gene.  This phenomenon has been reported many times, including for various immunoglobulins within mice and birds, mouse histone genes, horse globin genes, and human beta-globin genes.  It is not known whether this could be a role for the eta-globin gene as well.  However, the fact that the eta-globin pseudogene is located between the fetal and adult genes suggests that it might play a role in gene switching (there seems to be some preliminary evidence to this effect, although the eta gene sequence’s part in this is still unknown).

It seems that the protein-coding genes are actually rather simple informationally (on the level of bricks and mortar for building a house), and that the real informational complexity and functionality lies in the non-coding portion of the genome (the blueprint for directing where to put the bricks and mortar).  This portion of the genome directs when and where the protein building blocks are placed and is therefore vitally important to the overall structure and ultimate function of the resulting creature.  It was because of evolutionary bias that these non-coding regions of DNA were assumed to be junk for so long, and were therefore overlooked and unrecognized as key informational components in the genome.  Interestingly enough, such findings actually support the predictions of intelligent design theory while countering long-held evolutionary assumptions.  Of course, there are always ad hoc modifications to explain such failed predictions resulting from an evolutionary bias.

 

 

One Man's Junk . . .

 

Other pseudogenes and so-called transposons, such as the “Alu element” (once thought to be completely useless), are being found to have important functions. 

 

There is a growing body of evidence that Alu (a SINE – Short Interspersed Nuclear Element) sequences are involved in gene regulation, such as in enhancing and silencing gene activity, or can act as a receptor-binding site… This is surely a precedent for the functionality of other types of pseudogenes. 6, 7

   

In 1997 Flam et al published an article in the journal Science suggesting that "junk DNA" seemed to be structured much like a language system, such as a human language. "The authors of the paper employed linguistic tests to analyze junk DNA and discovered striking similarities to ordinary language. The scientists interpret those similarities as suggestions that there might be messages in the junk sequences, although it's anyone's guess as to how the language might work." 31 This is especially interesting because this same sort of argument would be used as evidence of extraterrestrial intelligence (like "ET") if such a language-like pattern were found in any other medium, like radio waves or etchings on Martian rocks.

Around 1998 Carl Schmid, a molecular biologist at the University of California at Davis, started advancing what seemed like a nutty idea to explain Alu’s unusual affinity for genes.  Schmid suggested Alu sequences resided near genes because they are not really “junk” sequences, but are rather useful sequences involved with a mechanism that helps cells repair themselves.  With the entire genome map in front of them, showing so many instances of Alu sequences around genes, scientists are beginning to take Schmid seriously.  “It looks pretty convincing,” Francis Collins said.  Others such as M.I.T. geneticist Eric Lander agree.8

More recently in 2001, a team of molecular geneticists discovered two “hot spots” where the same SINEs inserted independently:

 

  Vertebrate retrotransposons have been used extensively for phylogenetic analyses and studies of molecular evolution. Information can be obtained from specific inserts either by comparing sequence differences that have accumulated over time in orthologous copies of that insert or by determining the presence or absence of that specific element at a particular site.  The presence of specific copies has been deemed to be an essentially homoplasy-free phylogenetic character because the probability of multiple independent insertions into any one site has been believed to be nil. . . . We have identified two hot spots for SINE insertion within mys-9 and at each hot spot have found that two independent SINE insertions have occurred at identical sites.  These results have major repercussions for phylogenetic analyses based on SINE insertions, indicating the need for caution when one concludes that the existence of a SINE at a specific locus in multiple individuals is indicative of common ancestry.  Although independent insertions at the same locus may be rare, SINE insertions are not homoplasy-free phylogenetic markers.9

 

 

Even more recently, in the May 2003 issue of Nature, Jeannie Lee published an article entitled, "Complicity of Gene and Pseudogene" in which some interesting findings from work done by Hirotsune et al.13 were presented:

 

Dysfunctional in the sense that they cannot be used as a template for producing a protein, pseudogenes are in fact nearly as abundant as functional genes.  Why have mammals allowed their accumulation on so large a scale?  One proposed answer is that, although pseudogenes are often cast as evolutionary relics and a nuisance to genomic analysis, the processes by which they arise are needed to create whole gene families, such as those involved in immunity and smell.  But, are pseudogenes themselves merely byproducts of this process?  Or do apparent evolutionary pressures to retain them [natural selection] hint at some hidden biological function?  For one particular pseudogene, the latter seems to be true . . . Hirotsune and colleagues report the unprecedented finding that the Makorin1-p1 pseudogene [located on chromosome 5 in mice] performs a specific biological task [it regulates the expression of the Makorin1 gene which is located on a completely different chromosome - chromosome 6 in mice].

The work of Hirotsune et al. is provocative for revealing the first biological function of any pseudogene.  It challenges the popular belief that pseudogenes are simply molecular fossils -- the evidence of Mother Nature's experiments gone awry. 12,13

 

    In yet another recent Science article by Wojciech Makalowski, the following comments are made that seem to echo what design theorists have been saying for a very long time:

 

     Although catchy, the term "junk DNA" for many years repelled mainstream researchers from studying noncoding DNA.  Who, except a small number of genomic clochards, would like to dig through genomic garbage?  However, in science as in normal life, there are some clochards who, at the risk of being ridiculed, explore unpopular territories.  Because of them, the view of junk DNA, especially repetitive elements, began to change in the early 1990s.  Now, more and more biologists regard repetitive elements as genomic treasure. 14

   

       Then, as recently as the December 2003 issue of Annual Review of Genetics, Balakirev and Ayala published a paper entitled, "Pseudogenes: Are They 'Junk' or Functional DNA?"  Consider just a few of their conclusions and see if they do not again remind you of what design theorists have been claiming for a long time, namely that pseudogenes surely have important functions and therefore are not really "pseudo" after all:

 

      Pseudogenes have been defined as nonfunctional sequences of genomic DNA originally derived from functional genes. It is therefore assumed that all pseudogene mutations are selectively neutral and have equal probability to become fixed in the population. Rather, pseudogenes that have been suitably investigated often exhibit functional roles, such as gene expression, gene regulation, generation of genetic (antibody, antigenic, and other) diversity. Pseudogenes are involved in gene conversion or recombination with functional genes. Pseudogenes exhibit evolutionary conservation of gene sequence, reduced nucleotide variability, excess synonymous over nonsynonymous nucleotide polymorphism, and other features that are expected in genes or DNA sequences that have functional roles. . .

       An extensive and fast-increasing literature does not justify a sharp division between genes and pseudogenes that would place pseudogenes in the class of genomic "junk" DNA that lacks function and is not subject to natural selection. Pseudogenes are often extremely conserved and transcriptionally active. . .

       There seems to be the case that some functionality has been discovered in all cases, or nearly, whenever this possibility has been pursued with suitable investigations. One may well conclude that most pseudogenes retain or acquire some functionality and, thus, that it may not be appropriate to define pseudogenes as nonfunctional sequences of genomic DNA originally derived from functional genes, or as "genes that are no longer expressed but bear sequence similarity to active genes". Rather, pseudogenes might be defined as DNA sequences derived by duplication or retroposition from functional genes that are often subject to natural selection and therefore retain much of the original sequence and structure because they have acquired new regulatory or other functions, or may serve as reservoirs of genetic variability.15

 

 

Identical Human-Mouse Junk DNA? Lots of It?

 

Then, in May of 2004 Haussler and Bejerano used computers to compare the human genome with the mouse and the rat genomes.  They assumed that because humans, mice, and rats look so different, there would be differences in the genome. They did see the expected differences in the shared genes from the assumed 'common ancestor', but they were surprised to find long stretches of shared non-coding "junk" DNA that were exactly the same in humans and rodents. 

 

"There were about five hundred stretches of DNA in the human genome that hadn't changed at all in the millions and millions of years that separated the human from the mouse and the rat," says Haussler. "I about fell off my chair. It's very unusual to have such an amount of conservation continually over such a long stretch of DNA."32

 

Many of these stretches of DNA, called "ultraconserved" regions, don't appear to code for protein, so they might have been dismissed as junk if they hadn't shown up in so many different species.  Haussler "confirmed that negative selection is three times stronger in these regions than it is for nonsynonymous changes in coding regions."  As far as Haussler is concerned, "It is a mystery what molecular mechanisms would place virtually every base in a segment of size up to 1 kilobase [i.e., 1000 bp] under this level of negative selection."  That's about 500 regions of DNA, each up to 1000 bp long, that are identical between rats and humans: up to 500,000 identical genetic sites!  What is also surprising is that these same regions largely matched up with chicken, dog and fish sequences as well, but are absent from sea squirts and fruit flies.  Note that the last supposed common ancestor for all of these creatures was thought to live some 400 million years ago.  Of course, it is only logical to assume that if nature has gone to so much trouble to preserve these ultraconserved regions over all these years, then they must be more important than just 'junk.'  Haussler thinks the most likely scenario is that they control the activity of indispensable genes and embryo development.
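The kind of comparison described here, looking for stretches that are identical across several genomes, can be illustrated with a small sketch. It assumes the sequences have already been aligned to equal length, with "-" marking gaps, and it takes the minimum run length as a parameter because the text gives no threshold. The function name and the three short toy sequences are hypothetical; this is not the pipeline Haussler and Bejerano used.

```python
from typing import List, Tuple


def identical_runs(aligned: List[str], min_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges where every aligned
    sequence carries the same base, with no gaps, for >= min_len columns."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None
    # Iterate one column past the end so a run touching the end is closed.
    for i in range(length + 1):
        same = (
            i < length
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if same:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


# Hypothetical toy alignment; real ultraconserved elements are far longer.
human = "ACGTACGTTTGACCA"
mouse = "ACGTACGATTGACCA"
rat = "ACGTACGTTTGACGA"
print(identical_runs([human, mouse, rat], min_len=5))
```

In a real analysis the minimum length would be set in the hundreds of bases, so that short matches expected by chance are excluded.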

"From what we know about the rate at which DNA changes from generation to generation, the chance of finding even one stretch of DNA in the human genome that is unchanged between humans and mice and rats over these hundred million years is less than one divided by ten followed by 22 zeros. It's a tiny, tiny fraction. It's virtually impossible that this would happen by chance." 32

Of course, Haussler still believes very strongly that humans and rats do in fact share a common ancestor that lived a hundred million years ago or so.  The idea that perhaps humans and rats might have actually been individually created, deliberately, does not even cross his mind.  Even so, this discovery suggests that the genomes of both humans and rats must be doing something other than coding for proteins, but the purpose of these ultraconserved regions remains a mystery. Haussler thinks that solving this mystery might unlock the secrets of diseases like autism and epilepsy. "There are many cases that are unexplained by any changes in the genes," says Haussler. "This is a new area to look. Doctors who have patients where they have collected DNA samples can look for something common in all of those DNA samples that might explain what is going wrong with their patients— how does the DNA from their patients differ from the DNA of other people who don't have the disease? You look for the consistent difference. These places are a great place to look for some of the diseases that we are still mystified about." 32   

 

Haussler concludes with the following understatement: "I think other bits of 'junk' DNA will turn out not to be junk. I think this is the tip of the iceberg, and that there will be many more similar findings."  By 2007 Haussler and Bejerano had found 10,402 sequences, or transposons, that showed signs of function.  "We used to think they were mostly messing things up. Here is a case where they are actually useful," Bejerano said. 33

And the count of functional genetic elements once thought to be 'junk' continues to expand at an almost exponential rate . . .

 

 

 

Shared Mistakes

 

 

Another interesting argument is that various pseudogenes in different species often have certain shared "mistakes" -  that "must have originated in a common ancestor." 11  However, there is some evidence that nucleotide changes may not be completely random in certain gene locations.  Mutational "hotspots" have been identified in many genes as well as pseudogenes.  In these locations, point mutations, even specific types of point mutations, are much more common than elsewhere in the gene.  

 

 

Consider the GULOP (or GULO) pseudogene for example.  In most mammals this is an active gene encoding the enzyme L-gulono-γ-lactone oxidase (LGGLO). This is the enzyme that catalyzes the last step in the synthesis of ascorbic acid (vitamin C). In humans, GULO is located on chromosome 8 at p21.1 in a region that is rich in genes (see figure). As it turns out, this particular gene is defective in humans and other primates as well as several other creatures, including guinea pigs, bats and certain kinds of fish. Compared to the rat GULO gene, the human version, as well as the great ape version, has large or functionally significant deletions involving exons I-III, V-VI, VIII, and XI (see figure above).18-21   Compare this with the significant deletions of the guinea pig GULO sequence that involve exons I, V, and VI, all of which match losses seen in the primate sequence.  In addition to this, all four functionally detrimental stop codons (three TGA and one TAA) that are identified in the guinea pig are shared at the same site locations in the primate GULO pseudogene.

 

Of course, it seems that we humans are able to get along just fine without this gene because we eat a lot of foods that are rich in vitamin C, like citrus fruits.  So, what's the big deal?  Well, the argument goes something like this (as per a popular Talk.Origins essay by Edward E. Max, Ph.D.):

 

In most mammals functional GLO genes are present, inherited - according to the evolutionary hypothesis - from a functional GLO gene in a common ancestor of mammals. According to this view, GLO gene copies in the human and guinea pig lineages were inactivated by mutations. Presumably this occurred separately in guinea pig and primate ancestors whose natural diets were so rich in ascorbic acid that the absence of GLO enzyme activity was not a disadvantage--it did not cause selective pressure against the defective gene.

Molecular geneticists who examine DNA sequences from an evolutionary perspective know that large gene deletions are rare, so scientists expected that non-functional mutant GLO gene copies--known as "pseudogenes"--might still be present in primates and guinea pigs as relics of the functional ancestral gene. . . [Beyond this],  the theory of evolution would make the strong prediction that primates [like apes and monkeys] would carry similar crippling mutations to the ones found in the human pseudogene. A test of this prediction has recently been reported. A small section of the GLO pseudogene sequence was recently compared from human, chimpanzee, macaque and orangutan; all four pseudogenes were found to share a common crippling single nucleotide deletion that would cause the remainder of the protein to be translated in the wrong triplet reading frame (Ohta and Nishikimi BBA 1472:408, 1999). 11,20

 

 

Now, it is interesting that, among the many substitution mutations in the "GLO" pseudogene, many, though not all, are shared, including a single deletion mutation that is shared by all primates (when compared to the rat, of course).  If not for common descent, why would the sequences of human, chimpanzee, gorilla and orangutan reveal a single nucleotide deletion at position 97 in the coding region of Exon X? What are the odds that out of 165 base pairs the same one would be mutated in all these primates by random chance?  Pretty slim, right?  Is this not then overwhelming evidence of common evolutionary ancestry?
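The rhetorical question above can be put in rough numbers. The sketch below is an illustration only, under assumptions that are not part of the original argument: each lineage is taken to have exactly one single-nucleotide deletion somewhere in the 165 bp region, every position is taken to be equally likely, and the lineages are treated as independent. The function name is hypothetical.

```python
from fractions import Fraction


def same_site_probability(region_length: int, lineages: int) -> Fraction:
    """Probability that one deletion per lineage lands on the same site in
    every lineage, assuming uniform and independent placement.

    The first lineage may hit any site; each further lineage must match it,
    which has probability 1 / region_length.
    """
    if region_length < 1 or lineages < 1:
        raise ValueError("region_length and lineages must be positive")
    return Fraction(1, region_length) ** (lineages - 1)


# Four primates (human, chimpanzee, gorilla, orangutan), 165 base pairs.
odds = same_site_probability(165, 4)
print(odds)         # exact fraction
print(float(odds))  # decimal approximation
```

Under these assumptions the chance is 1 in 4,492,125, or roughly 2 in 10 million. The figure depends entirely on the equal-likelihood assumption, and a mutational hotspot is precisely a case where that assumption fails.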

This would indeed seem to be the case at first approximation. However, in 2003, the same Japanese group published the complete sequence of the guinea pig GLO pseudogene, which is thought to have evolved independently, and compared it to that of humans [Inai et al, 2003]. 21 Surprisingly, they reported many shared mutations (deletions and substitutions) present in both humans and guinea pigs. Remember now that humans and guinea pigs are thought to have diverged at the time of the common ancestor with rodents. Therefore, a mutational difference between a guinea pig and a rat should not be shared by humans with better than random odds. But, this was not what was observed. Many mutational differences were shared by humans, including the one at position 97.  According to Inai et al, this indicated some form of non-random bias that was independent of common descent or evolutionary ancestry. The probability of the same substitutions in both humans and guinea pigs occurring at the observed number of positions was calculated, by Inai et al, to be 1.84 x 10^-12, consistent with mutational hotspots.

 

 

 

 

What is interesting here is that the mutational hot spots found in guinea pigs and humans exactly match the mutations that set humans and primates apart from the rat (see figure below). 21,22  This particular feature has given rise to the obvious argument that Inai et al got it wrong.  Reed Cartwright, a population geneticist, has noted a methodological flaw in the Inai paper:


     "However, the sections quoted from Inai et al. (2003) suffer from a major methodological error; they failed to consider that substitutions could have occurred in the rat lineage after the splits from the other two. The researchers actually clustered substitutions that are specific to the rat lineage with separate substitutions shared by guinea pigs and humans. . . 
     If I performed the same analysis as Inai et al. (2003), I would conclude that there are ten positions where humans and guinea pigs experienced separate substitutions of the same nucleotide, otherwise known as shared, derived traits. These positions are 1, 22, 31, 58, 79, 81, 97, 100, 109, 157. However, most of these are shown to be substitutions in the rat lineage when we look at larger samples of species.
     When we look at this larger data table, only one position of the ten, 81, stands out as a possible case of a shared derived trait, one position, 97, is inconclusive, and the other eight positions are more than likely shared ancestral sites. With this additional phylogenetic information, I have shown that the "hot spots" Inai et al. (2003) found are not well supported."

 

 

 

 

 

 

 

It does indeed seem that a number of the sequence differences noted by Cartwright are fairly unique to the rat, especially when one includes several other species in the comparison. However, I do have a question regarding this point.  It seems to me that there simply are too many loci where the rat is the only odd sequence out in Exon X (i.e., there are seven and arguably eight of these loci).  Given the published estimate of mutation rates (Drake) of about 2 x 10^-10 per locus per generation, one should expect to see only 1 or 2 mutations in the 164 nucleotide exon in question (Exon X) over the course of the assumed time of some 30 Ma (million years).  Therefore, the argument that the mutational differences are due to mutations in the rat lineage presupposes a much greater mutation rate in the rat than in the guinea pig.  The same thing is true if one compares the rat with the mouse (i.e., the rat's evident mutation rate is much higher than that of the mouse).
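The "1 or 2 mutations" estimate can be reproduced with simple arithmetic: the per-site rate, times the number of sites, times the number of generations. The sketch below does this. The generation time is an assumption that does not appear in the text (0.5 years, i.e. two generations per year), the function name is hypothetical, and the calculation treats the neutral substitution rate as equal to the mutation rate, which is the usual simplification for neutral sites.

```python
def expected_substitutions(rate_per_site_per_generation: float,
                           sites: int,
                           years: float,
                           generation_time_years: float) -> float:
    """Expected number of neutral substitutions in one lineage."""
    if generation_time_years <= 0:
        raise ValueError("generation_time_years must be positive")
    generations = years / generation_time_years
    return rate_per_site_per_generation * sites * generations


# Figures from the text: 2 x 10^-10 per site per generation, 164 sites, 30 Ma.
# ASSUMPTION (not from the text): a generation time of 0.5 years.
estimate = expected_substitutions(2e-10, 164, 30e6, 0.5)
print(round(estimate, 2))
```

With these inputs the expectation comes out at about 2, in line with the figure quoted above. The result scales inversely with the assumed generation time, so halving it to 0.25 years would double the expectation to about 4.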

This is especially interesting since many of the DNA mutations are synonymous.  Why should essentially neutral mutations become fixed to a much greater extent in the rat gene pool as compared to the other gene pools? Wouldn't this significant mutation rate difference, by itself, seem to suggest a mutationally "hot" region, at least in the rat?

Beyond this, several loci differences are not exclusive to the rat/mouse gene pools and therefore suggest mutational hotspots beyond the general overall "hotness" or propensity for mutations in this particular genetic sequence.

 

 

 

 

Some have noted that although the shared mutations may be the result of hotspots, there are many more mutational differences between humans and rats/guinea pigs as compared to apes.  Therefore, regardless of hotspots, humans and apes are clearly more closely related than are humans and rats/guinea pigs. 

The problem with this argument is that the rate at which mutations occur is related to the average generation time.  Those creatures that have a shorter generation time have a correspondingly higher mutation rate over the same absolute period of time - like 100 years.  Therefore, it is only to be expected that those creatures with relatively long generation times, like humans and apes, would have fewer mutational differences relative to each other over the same period of time relative to those creatures with much shorter generation times, like rats and guinea pigs.  

What is interesting about many of these mutational losses is that they often share the same mutational changes.  It is at least reasonably plausible then that the GULO mutation could also be the result of a similar genetic instability that is shared by similar creatures (such as humans and the great apes).

This same sort of thing is seen to a fairly significant degree in the GULO region.  Many of the same regional mutations are shared between humans and guinea pigs.  Consider the following illustration yet again:

 

 

Why would both humans and guinea pigs share major deletions of exons I, V and VI as well as four stop codons if these mutations were truly random?  In addition to this, a mutant group of Danish pigs has also been found to show a loss of GULO functionality.  And, guess what, the key mutation in these pigs was a loss of a sizable portion of exon VIII.  This loss also matches the loss of primate exon VIII.  In addition, there is a frame shift in intron 8 which results in a loss of correct coding for exons 9-12.  This also reflects a very similar loss in this region in primates (see Link).  That's quite a few key similarities that were clearly not the result of common ancestry for the GULO region.  This seems to be very good evidence that many if not all of the mutations of the GULO region are indeed the result of similar genetic instabilities that are prone to similar mutations - especially in similar animals.

 

As an aside, many other genetic mutations that result in functional losses are known to commonly affect the same genetic loci in the same or similar manner outside of common descent.  For example, achondroplasia is a spontaneous mutation in humans in about 85% of the cases. In humans achondroplasia is due to mutations in the FGFR2 gene. A remarkable observation on the FGFR2 gene is that the major part of the mutations are introduced at the same two spots (755 C->G and 755-757 CGC->TCT) independent of common descent. The short legs of the Dachshund are also due to the same mutation(s). The same allelic mutation has occurred in sheep as well.  

 

 

 

 

Real Time Molecular Convergence

 

 

Another interesting example of this phenomenon has been studied in detail in more rapidly reproducing organisms, such as viruses.  For example, an interesting study was published by Bull et al. on replicate lineages of the bacteriophage phiX174.  Numerous mutations occurred in each genome during propagation. Across nine separate lineages 119 independent substitutions occurred at 68 nucleotide sites.  What is interesting here is that over half of these substitutions at 1/3 of the sites were identical in the different lineages. Some convergent substitutions were specific to specific hosts while others were shared between the two separate hosts.  Phylogenetic reconstruction using the complete genome sequence not only failed to recover the correct evolutionary history because of these convergent changes, but the true history was rejected as being a significantly inferior fit to the data. 27 In a subsequent similar study Bull et al. argue that such results "point to a limited number of pathways taken during evolution in these viruses, and also raise the possibility that much of the amino-acid variation in the natural evolution of these viruses has been selected." 29 In other words, much of the variation in viral genomes is not neutral, but is in fact functional and therefore maintained by natural selection. 

  This is amazing!  The implications here are quite stunning.  If the convergent nature of molecular mutations like this cannot be adequately detected, such mutations would interfere with any sort of reliable phylogenetic tree building or accurate determination of evolutionary relationships.  If there is any sort of correlation with higher-level multicellular organisms, this could significantly undermine the entire science of evolutionary biology as it is currently understood.  Real time studies like this are obviously needed on a wider scale to determine if such convergent mutations are more widespread.  Obviously, the common assumption that convergent mutations on the molecular level are rare and the result of completely random chance is simply not true anymore for at least some (and possibly most if not all) genomes.  
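  The kind of tally reported for the phiX174 lineages (how many substitution events recur in more than one independent lineage) can be sketched as follows.  This is an illustrative sketch only: the lineage names and substitutions are made-up toy data, not the data of Bull et al., and the function name tally_convergence is hypothetical.

```python
from collections import Counter

# Hypothetical toy data: each independent lineage maps to the set of
# substitutions it acquired, written as (site, new_base).
lineages = {
    "A": {(101, "T"), (250, "G"), (877, "C")},
    "B": {(101, "T"), (412, "A")},
    "C": {(250, "G"), (101, "T"), (933, "A")},
}

def tally_convergence(lineages):
    """Count substitution events and those seen in more than one lineage."""
    counts = Counter(sub for subs in lineages.values() for sub in subs)
    total_events = sum(counts.values())
    convergent = {sub: n for sub, n in counts.items() if n > 1}
    convergent_events = sum(convergent.values())
    return total_events, convergent_events, convergent

total, conv_events, conv = tally_convergence(lineages)
all_sites = {site for subs in lineages.values() for site, _ in subs}
conv_sites = {site for site, _ in conv}

print(f"{conv_events} of {total} substitution events are shared between lineages")
print(f"{len(conv_sites)} of {len(all_sites)} variable sites carry shared substitutions")
```

A substitution is counted as convergent here only when the same site changes to the same base in at least two lineages; a real analysis would also have to rule out cross-contamination between lineages.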

A similar finding was described more recently by Cuevas et al. in a 2002 article published in Genetics dealing with RNA viruses (see Addendum).28  In this study the authors again demonstrated convergences in 12 variable sites in independent lineages.  The authors were surprised to discover that convergences occurred not only within non-synonymous sites, but in synonymous sites and intergenic regions as well (usually thought to be neutral with respect to the effects of natural selection).  The authors also noted that this phenomenon is not restricted to the laboratory, but is also a relatively widespread observation among HIV-1 virus clones in humans and in SHIV strains isolated from macaques, monkeys, and humans.  

  These same authors go on to note that, "Convergent evolution at the molecular level is not controversial as long as it can be reconciled with the neutralist and the selectionist theories. The neutral theory suggests that convergences are simply accidents, whereas within the framework of selectionism, there are two qualifications for convergences.  The first explanation considers convergences as being adaptive and the result of organisms facing the same environment (as in the case of our experiments) with a few alternative pathways of adaptation (as expected for compacted genomes).  Second, keeping in mind the model of clonal interference, beneficial mutations have to become fixed in an orderly way (Gerrish and Lenski 1998), with the best possible candidate fixed first, and then the second best candidate, and so on.  This implies that, given a large enough population size to make clonal interference an important evolutionary factor, we should always expect the same mutations to be fixed."

  According to the authors, the above argument is valid for nonsynonymous changes but an alternative explanation must be found for synonymous changes and for changes in the intergenic regions since these changes are generally thought to be selectively neutral. So, the authors note that, "Genomic RNA is involved in many RNA-RNA and RNA-protein interactions that affect viral replication. This is obvious for noncoding, regulatory regions (Stillman and Whitt 1997, 1998), but there is increasing evidence that capsid-coding regions in picornaviruses may also have an effect on viral replication (McKnight and Lemon 1998; Fares et al. 2001). Therefore, the RNA itself (apart from its protein-coding capacity) may contribute to the viral phenotype, and fitness may also be affected by synonymous replacements."  This is an important point because, "Evidence for selection on synonymous sites has been inferred also in mammals (Eyre-Walker 1999), as a consequence of selection acting upon the base composition of isochors and large sections of junk DNA." 

  In other words, there doesn't seem to be much genetic sequence, even in seemingly non-functional regions or among synonymous changes, that is truly non-functional when it comes to viral genomes.  The authors then go on to suggest a comparison with the genomes of higher-level organisms, like hominids.  

  "For example, Fay et al (2001) reported that, in humans, the vast majority (80%) of amino acidic changes are deleterious to some extent and only a minor fraction are neutral.  Among these deleterious amino acidic mutations, at least 20% are slightly deleterious.  Here, we found that 15 amino acid sites changed, with only 5 being significantly advantageous. At this point, we can only speculate about the selective role of all the amino acid sites shown to be invariable in our study.  The total number of amino acids in five genes of VSV is 3536.  Assuming that changes in any of the 3536 - 15 = 3521 invariable amino acids would be deleterious (and thus washed out by purifying selection during our evolution experiment), then the fraction of amino acid replacements that are potentially harmful would be 3521/3536 = ~99.58%; the fraction of neutral sites would be 10/3536 = ~0.28%; whereas only 5/3536 = ~0.14% would be beneficial.  Despite the differences between humans and VSV in genome size and organization and in the nature of the nucleic acid used, in both cases the fraction of potentially deleterious amino acid substitutions is overwhelmingly larger than that of neutral or beneficial ones." 

  In other words, it is at least reasonable to suspect that very little coding DNA, even in hominids, is truly "neutral" or immune to all pressures of natural selection.  This is becoming true of non-coding DNA as well given that much of what was once thought to be junk is now being found to be functional ( Link ). This strongly suggests that many of what were thought to be shared mutational errors might actually be functionally maintained by similar creatures in similar environments.  In this light, consider the following conclusions of Wood et al. published in a 2005 edition of Genetica:

 

 

The most convincing evidence of parallel genotypic adaptation comes from artificial selection experiments involving microbial populations. In some experiments, up to half of the nucleotide substitutions found in independent lineages under uniform selection are the same. Phylogenetic studies provide a means for studying parallel genotypic adaptation in non-experimental systems, but conclusive evidence may be difficult to obtain because homoplasy can arise for other reasons. Nonetheless, phylogenetic approaches have provided evidence of parallel genotypic adaptation across all taxonomic levels, not just microbes. Quantitative genetic approaches also suggest parallel genotypic evolution across both closely and distantly related taxa, but it is important to note that this approach cannot distinguish between parallel changes at homologous loci versus convergent changes at closely linked non-homologous loci. The finding that parallel genotypic adaptation appears to be frequent and occurs at all taxonomic levels has important implications for phylogenetic and evolutionary studies. With respect to phylogenetic analyses, parallel genotypic changes, if common, may result in faulty estimates of phylogenetic relationships. [Emphasis added] 30

 

 

Notice that according to Wood et al., parallel and/or convergent mutations are "frequent" at "all taxonomic levels, not just microbes".  That's very interesting and does indeed have very serious implications when it comes to determining phylogenetic relationships - relationships that are likely to be not only wrong, but meaningless as far as the evolutionary theory of common descent is concerned.  Rather, phylogenetic similarities may be more a reflection of functional similarities and differences than of true evolutionary relationships.  

 

 

 

 

Mutational Hotspots

 

 

Back to mutational hotspots, what makes hotspots so "hot"?  Perhaps the answer lies in the chemical nature of the hotspot region.  The type of molecular bonds, their stability or instability, or other molecular interactions may lend themselves to specific nucleotide pair switches, especially given certain environmental changes.  No one really knows for sure except to say that mutational hotspots do exist.  So, given that they do exist, similar genes should be expected to function in similar ways and this includes having similar mutational "hotspots" and/or "shared mistakes." 3   In any case, it is interesting to note that there are no such examples of "shared errors" between mammals and other groups of animals (although there are plenty of common "errors" that are shared by widely divergent mammalian groups). 

 

There are no examples of 'shared errors' that link mammals to other branches of the genealogic tree of life on earth. . . Therefore, the evolutionary relationships between distant branches on the evolutionary genealogic tree must rest on other evidence besides 'shared errors.' 11

 

Of course the argument used to explain this fact is that mammals split off from other groups of animals over 200 million years ago.  Given this amount of time, random mutations would have obliterated any trace of common genetic errors. 11   This is a very good point.  The question remains, however, as to why some identifiable genetic errors are maintained as long as they are if they are in fact functionless.  Also, "processed pseudogenes" are very similar to "movable genetic elements" which are often transmitted from animal to animal by viruses.  Certain interspecies pseudogenes of this type might in fact share a common ancestor while the various types of animals themselves, that harbor certain of these genetic sequences, may not be related through common descent so much as they are partially related through common infection.

In any case, there really are no "foolproof" genetic markers of common descent.  All of the ones proposed so far to be foolproof have been shown to have significant flaws.  The prediction that pseudogenes, transposons (SINEs and LINEs) and other shared mutational mistakes are conclusive evidence for common descent has not held up over recent years. For example, consider the following excerpt from David Hillis' paper entitled "SINEs of the perfect character," published in the Proceedings of the National Academy of Sciences, 1999:

 

  What of the claim that the SINE/LINE insertion events are perfect markers of evolution (i.e., they exhibit no homoplasy)?  Similar claims have been made for other kinds of data in the past, and in every case examples have been found to refute the claim.  For instance, DNA-DNA hybridization data were once purported to be immune from convergence, but many sources of convergence have been discovered for this technique.  Structural rearrangements of genomes were thought to be such complex events that convergence was highly unlikely, but now several examples of convergence in genome rearrangements have been discovered.  Even simple insertions and deletions within coding regions have been considered to be unlikely to be homoplastic, but numerous examples of convergence and parallelism of these events are now known.  Although individual nucleotides and amino acids are widely acknowledged to exhibit homoplasy, some authors have suggested that widespread simultaneous convergence in many nucleotides is virtually impossible. Nonetheless, examples of such convergence have been demonstrated in experimental evolution studies. 10

 

 

A New Paradigm

 

       Obviously then, the old notions that pseudogenes and other forms of shared "junk" DNA give clear evidence of common ancestry over common functional need will have to be discarded.  Certainly if organisms share similar environments and have similar morphologic appearances and needs, should one be surprised to find similar functional genetic elements shared between such creatures?   Such sequences cannot be used to clearly establish evolutionary trees and to estimate divergence times since such beneficial sequences would be maintained over time via natural selection without any significant changes.  The similarities and differences would not be based so much on evolutionary changes over the time since a shared common ancestor as they would be the result of similarities and differences in functional needs that have always been there, maintained by the forces of natural selection, since these creatures came to be.

 

        No one knows yet just what the big picture of genetics will look like once this hidden layer of information is made visible. "Indeed, what was damned as junk because it was not understood may, in fact, turn out to be the very basis of human complexity," Mattick suggests. Pseudogenes, riboswitches and all the rest aside, there is a good reason to suspect that is true. Active RNA, it is now coming out, helps to control the large-scale structure of the chromosomes and some crucial chemical modifications to them—an entirely different, epigenetic layer of information in the genome.16

       In fact, the most detailed probe yet into the workings of the human genome has led scientists to conclude [as of June 14, 2007] that a cornerstone concept about the chemical code for life is badly flawed.  Reporting in the British journal Nature and the US journal Genome Research on Thursday [June 14, 2007], they suggest that an established theory about the genome should be consigned to history.

        In between the genes and the sequences known to regulate their activity are long, tedious stretches that appear to do nothing. The term for them is "junk" DNA, reflecting the presumption that they are merely driftwood from our evolutionary past and have no biological function. But the work by the ENCODE (ENCyclopaedia of DNA Elements) consortium implies that this nuggets-and-dross concept of DNA should be, well, junked.

        The genome turns out to be a highly complex, interwoven machine with very few inactive stretches, the researchers report. Genes, it transpires, are just one of many types of DNA sequences that have a functional role. And "junk" DNA turns out to have an essential role in regulating the protein-making business. Previously written off as silent, it emerges as a singer with its own discreet voice, part of a vast, interacting molecular choir. 

        "The majority of the genome is copied, or transcribed, into RNA, which is the active molecule in our cells, relaying information from the archival DNA to the cellular machinery," said Tim Hubbard of the Wellcome Trust Sanger Institute, a British research group that was part of the team. "This is a remarkable finding, since most prior research suggested only a fraction of the genome was transcribed."

        Francis Collins, director of the US National Human Genome Research Institute (NHGRI), which corralled 35 scientific groups from around the world into the ENCODE project, said the scientific community "will need to rethink some long-held views about what genes are and what they do."17

           

 

The human genome in numbers 26

 

  • 1.5% of the genome translated into proteins
  • 27% of the genome transcribed as part of protein-coding gene expression but not translated into proteins
  • 25% of the genome that is transcribed but not translated, and is not associated with protein-coding genes
  • 250 microRNAs currently identified (as of June 2005) 
    • ~1,000 as of 2007 ( Link )
  • 10,000 protein-coding genes estimated to be regulated by microRNAs; each microRNA can target several genes, and a particular gene may be regulated by several microRNAs
  • 98% of genomic output that is non-coding RNA
  • 9% of genes that appear to have associated antisense transcripts
  • ~20,000 "pseudogenes" in the genome
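As a quick check on how the first three figures in this list combine, the sketch below simply adds them.  It assumes that the three categories do not overlap, which the list implies but does not state outright; the variable names are illustrative only.

```python
# Percentages of the genome, taken from the first three items in the list above.
translated = 1.5                      # translated into proteins
transcribed_genic_untranslated = 27.0  # transcribed with protein-coding genes, not translated
transcribed_nongenic = 25.0            # transcribed, not associated with protein-coding genes

# Assumption: the three categories are non-overlapping, so they can be summed.
total_transcribed = (
    translated + transcribed_genic_untranslated + transcribed_nongenic
)

print(f"total transcribed: {total_transcribed:.1f}% of the genome")
print(f"more than half of the genome: {total_transcribed > 50.0}")
```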

 

 

        This is very interesting.  I mean, who would have thought that the majority of the genome would be copied or transcribed into RNA? - and that it would in fact be functional?  Only a few years ago the scientific community believed that less than 5% of the genome was actually functional and the rest was non-functional evolutionary remnants.  After all, "noncoding genomic regions account for 98% to 99% of the human genome and consist of introns found within protein-coding transcripts and the intergenic regions between them."25  Add to these numbers the very surprising finding that many genetic sequences that do not produce either proteins or RNA are also being found to be functional (see discussion of Pyknons).

       Who would have predicted this? - - besides creationists and intelligent design theorists that is?  Creationists and intelligent design theorists have been claiming for many years that the concept of "Junk DNA" (as well as vestigial structures) was not entirely correct. I myself have been promoting this idea for over 11 years (as of June, 2008).  Yet, only now are mainstream scientists finally starting to realize the significant errors in their long-cherished beliefs when it comes to the ill-conceived notion of junk DNA - an idea which was based on ardently held evolutionary presuppositions that blinded mainstream science and prevented them from searching out the hidden treasures of so-called "junk DNA" for a fairly long time. 

       When are scientists going to start realizing that the creationist paradigm does indeed have very good predictive scientific value when it comes to accurately understanding and investigating the physical world and universe?

 

 

 

Pyknons

 

 

       To add to this, consider the fairly recent finding (2006) of "pyknons" by Rigoutsos et al.24 Pyknons are variable-length patterns within DNA sequences that have identically conserved copies and multiplicities above what is expected by chance. They are also not transcribed into RNA (unlike miRNAs noted above) or translated into protein. Among the millions of discovered patterns, Rigoutsos et al. found a subset of 127,998 patterns, which they termed pyknons, that have additional nonoverlapping instances in the untranslated and protein-coding regions of 30,675 transcripts from 20,059 human genes. The pyknons arrange combinatorially in the untranslated and coding regions of numerous human genes where they form mosaics. Consecutive instances of pyknons in these regions show a strong bias in their relative placement, favoring distances of ~22 nucleotides.

       Pyknons are also very common in the human genome.  They form 1/6th of the human intergenic and intronic regions for a total of 127,998 pyknons covering 898,424,004 DNA nucleotide positions on the forward and reverse strands of the human genome. 

       What is interesting here, of course, is that pyknons are associated with specific biologic processes - i.e., they are functional. Cross-genome comparisons reveal that many of the pyknons have instances in the 3' UTRs of genes from other vertebrates and invertebrates where they are overrepresented in similar biological processes, as in the human genome. This "unexpected finding" suggests, according to the authors, potential unique functional connections between the coding and noncoding parts of the human genome - such as a possible link with posttranscriptional gene silencing and RNA interference. 

 

 

     "Human pyknons are also present in other genomes, where they associate with similar biological processes.  Notably, >600 million nucleotides that are associated with nongenic copies of pyknons in the human genome are absent from the mouse and rat genomes. Interestingly, the human pyknons have many instances in the intergenic and intronic regions of the phylogenetically distant worm and fruit fly genomes, covering ~1.6 million nucleotides in each."24

 

 

     Given that genetic sequences that are transcribed or translated or both seem to account for the "majority" of the genome, and are thought to be functionally beneficial, it is interesting that certain types of genetic sequences that are neither translated nor transcribed are also being found to be functional.  Taken together, it seems like the significant majority of the genome is indeed functional to at least some degree - well over 50% if not more like 85-90% or even higher?  

 

 

 

The Key Human-Ape Differences

 

       It is becoming more and more clear that the key functional differences between living things, like humans and apes, are not so much found in protein-coding genes, but in the non-coding regions of DNA once thought to be functionless "junk-DNA" - evolutionary remnants of past mistakes that are shared between various creatures.  This notion is starting to be shed with more and more discoveries that show that many of these same regions are not just functional, they carry the vast majority of the genetic information.  The "genes" that were once thought to be so important for genetic function are turning out to be equivalent to the most low-level basic building blocks within the genome, like bricks and mortar.  Surprisingly, it is the non-coding regions of DNA that control what is done with these building blocks - that determine what kind of "house" to build so to speak.  The following article is very interesting in this regard:

 

 

     "Seventy-five percent of known human miRNAs [microRNAs] cloned in this study were conserved in vertebrates and mammals, 14% were conserved in invertebrates, 10% were primate specific and 1% are human specific. The new miRNAs have a different conservation distribution: more than half of the human miRNAs were conserved only in primates, about 30% in mammals and 9% in nonmammalian vertebrates or invertebrates; 8% were specific to humans. We saw a similar distribution for the chimpanzee miRNAs.

     The different miRNA repertoire, as well as differences in expression levels of conserved miRNAs, may contribute to gene expression differences observed in human and chimpanzee brain. Although the physiological relevance of miRNAs expressed at low levels remains to be shown, it is tempting to speculate that a pool of such miRNAs may contribute to the diversity of developmental programs and cellular processes . . . For example, miRNAs recently have been implicated in synaptic development and in memory formation. As the species specific miRNAs described here are expressed in the brain, which is the most complex tissue in the human body, with an estimated 10,000 different cell types, these miRNAs could have a role in establishing or maintaining cellular diversity and could thereby contribute to the differences in human and chimpanzee brain ... function." 23

     

     Pseudogenes are also being found to have functionality similar to that of miRNAs.  "Transcripts of processed pseudogenes can contain regions with significant antisense homology, which may suggest a regulatory role for transcribed pseudogenes through an RNAi-like mechanism" (see Link ).  Two recent studies have demonstrated that such transcribed pseudogenes can regulate transcription of homologous protein-coding genes. Transcription of a pseudogene in Lymnaea stagnalis, that is homologous to the nitric oxide synthase gene, decreases the expression levels for the gene through formation of an RNA duplex; this is thought to arise via a reverse-complement sequence found at the 5′ end of the pseudogene transcript (Link). In a second example, transcription of the makorin1-p1 TPΨg in mouse was required for the stability of the mRNA from a homologous gene makorin1. This regulation was deduced to arise from an element in the 5′ areas of both the gene and the pseudogene (Link).  More recently, Weil et al. discovered that the murine FGFR-3 pseudogene is transcribed in fetal tissues in an antisense direction. This prompted the following consideration:

 

     'As the regions of exact identity between FGFR-3 and its pseudogene can be up to 60 nt long, it may be envisioned that FGFR-3 transcripts could play a regulatory role in FGFR-3 expression. If these antisense transcripts could hybridize to sense FGFR-3 transcripts inside the cells, this may lead to either rapid degradation or inhibition of translation.' (Link)

 

     As Yao et al. predict, "Further studies on transcribed pseudogenes will add to our understanding of their potential roles as non-coding RNA genes or other new types of functional elements." (Link)  It seems like many transcribed pseudogenes may act as giant miRNAs to regulate the function of protein-coding genes and other genetic elements.

 

 

  Additional information dealing with this most interesting topic is listed in a fairly extensive essay by Wade Schauer (used with permission).

   

Addendum:

 

José M. Cuevas, Santiago F. Elena and Andrés Moya, Molecular Basis of Adaptive Convergence in Experimental Populations of RNA Viruses, Genetics, October 2002, 162: 533–542 ( Link ):

 

     Our experiment dealt with the existence of evolutionary convergences.  Evolutionary convergences constitute a very slippery topic, since a result of convergent evolution can always be seen as a cross-contamination by those critical of the existence of convergence.  The only serious way to address evolutionary convergences is to (1) design and run experiments in such a way that physical or temporal coexistence of evolving lineages is minimized and (2) test whether the results can be explained by potential contaminations at different experimental steps.  With our experiments, we took all possible precautions to minimize the risk of cross-contaminations and, in fact, a detailed phylogenetic analysis of our results supports the view that our results are better explained by evolutionary convergences than by a general contamination at different steps. . .

     One of the most amazing features illustrated in Figure 1 is the large amount of evolutionary convergences observed among independent lineages.  Twelve of the variable sites were shared by different lineages.  More surprisingly, convergences also occurred within synonymous sites and intergenic regions.  Evolutionary convergences during the adaptation of viral lineages under identical artificial environmental conditions have been described previously (Bull et al. 1997; Wichman et al. 1999; Fares et al. 2001). However, this phenomenon is observed not only in the laboratory.  It is also a relatively widespread observation among human immunodeficiency virus (HIV)-1 clones isolated from patients treated with different antiviral drugs; parallel changes are frequent, often following a common order of appearance (Larder et al. 1991; Boucher et al. 1992; Kellam et al. 1994; Condra et al. 1996; Martinez-Picado et al. 2000).  Subsequent substitutions may confer increasing levels of drug resistance or, alternatively, may compensate for deleterious pleiotropic effects of earlier mutations (Molla et al. 1996; Martinez-Picado et al. 1999; Nijhuis et al. 1999).  Also, molecular convergences have been observed between chimeric simian-human immunodeficiency viruses (strain SHIV-vpu+) isolated from pig-tailed macaques, rhesus monkeys, and humans after either chronic infections or rapid virus passage (Hofmann-Lehmann et al. 2002). 

     Convergent evolution at the molecular level is not controversial as long as it can be reconciled with the neutralist and the selectionist theories. The neutral theory suggests that convergences are simply accidents, whereas within the framework of selectionism, there are two qualifications for convergences.  The first explanation considers convergences as being adaptive and the result of organisms facing the same environment (as in the case of our experiments) with a few alternative pathways of adaptation (as expected for compacted genomes).  Second, keeping in mind the model of clonal interference, beneficial mutations have to become fixed in an orderly way (Gerrish and Lenski 1998), with the best possible candidate fixed first, and then the second best candidate, and so on.  This implies that, given a large enough population size to make clonal interference an important evolutionary factor, we should always expect the same mutations to be fixed.

     The above argument is valid for nonsynonymous changes but an alternative explanation must be found for synonymous changes and for changes in the intergenic regions.  Genomic RNA is involved in many RNA-RNA and RNA-protein interactions that affect viral replication. This is obvious for noncoding, regulatory regions (Stillman and Whitt 1997, 1998), but there is increasing evidence that capsid-coding regions in picornaviruses may also have an effect on viral replication (McKnight and Lemon 1998; Fares et al. 2001). Therefore, the RNA itself (apart from its protein-coding capacity) may contribute to the viral phenotype, and fitness may also be affected by synonymous replacements.  Evidence for selection on synonymous sites has been inferred also in mammals (Eyre-Walker 1999), as a consequence of selection acting upon the base composition of isochors and large sections of junk DNA. 

     For the sake of illustration, it would be interesting to compare the number of selectively important sites in the VSV genome with those estimated for other genomes.  For example, Fay et al (2001) reported that, in humans, the vast majority (80%) of amino acidic changes are deleterious to some extent and only a minor fraction are neutral.  Among these deleterious amino acidic mutations, at least 20% are slightly deleterious.  Here, we found that 15 amino acid sites changed, with only 5 being significantly advantageous. At this point, we can only speculate about the selective role of all the amino acid sites shown to be invariable in our study.  The total number of amino acids in five genes of VSV is 3536.  Assuming that changes in any of the 3536 - 15 = 3521 invariable amino acids would be deleterious (and thus washed out by purifying selection during our evolution experiment), then the fraction of amino acid replacements that are potentially harmful would be 3521/3536 = ~99.58%; the fraction of neutral sites would be 10/3536 = ~0.28%; whereas only 5/3536 = ~0.14% would be beneficial.  Despite the differences between humans and VSV in genome size and organization and in the nature of the nucleic acid used, in both cases the fraction of potentially deleterious amino acid substitutions is overwhelmingly larger than that of neutral or beneficial ones. 28
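     The fractions given in the passage above can be re-derived from the three counts it states (3536 total amino acid sites, 15 changed, 5 advantageous).  The following is a minimal sketch of that arithmetic; the variable and function names are illustrative, and the classification of the 10 remaining changed sites as neutral follows the quoted passage.

```python
TOTAL_SITES = 3536  # amino acids in five VSV genes, per the quoted passage
CHANGED = 15        # amino acid sites that changed in the experiment
BENEFICIAL = 5      # changed sites judged significantly advantageous

invariable = TOTAL_SITES - CHANGED  # sites assumed deleterious if changed
neutral = CHANGED - BENEFICIAL      # changed sites treated as neutral

def pct(count):
    """Share of all amino acid sites, as a percentage."""
    return 100.0 * count / TOTAL_SITES

print(f"invariable sites:      {invariable}")
print(f"potentially harmful:   {pct(invariable):.2f}%")
print(f"neutral:               {pct(neutral):.2f}%")
print(f"beneficial:            {pct(BENEFICIAL):.2f}%")
```

The three percentages sum to 100% because every site falls into exactly one of the three classes under the authors' stated assumption.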

 

 

References:

  1. Jacq C, Miller JR, Brownlee GG. A pseudogene structure in 5S DNA of Xenopus laevis, Cell 12:109-120. 1977.

  2. Gibson L. J., Pseudogenes and Origins, Origins 21(2):91-108. 1994.

  3. Menotti R.M., Starmer W.T., Sullivan D.T., Characterization of the structure and evolution of the Adh region of Drosophila hydei, Genetics 127:355-366. 1991.

  4. Lalley P.A., Davisson M.T., Graves J.A.M., O’Brien S.J., Womack J.E., Roderick T.H., Creau-Goldberg N., Hillyard A.L., Doolittle D.P., Rogers J.A., Report of the committee on comparative mapping, Cytogenetics and Cell Genetics 51:503-532. 1989.

  5. Long M., Langley C.H., Natural selection and the origin of jingwei, a chimeric processed functional gene in Drosophila, Science 260:91-95. 1993.

  6. Jerlstrom, Pierre. 2000. Pseudogenes. Creation Ex Nihilo Technical Journal 14 (no. 3):15.  

  7. Woodmorappe, John. 2000. Are Pseudogenes 'Shared Mistakes' Between Primate Genomes? Creation Ex Nihilo Technical Journal 14 (no. 3):58-71.

  8. Abate, Tom. 2001. Genome Discovery Shocks Scientists. San Francisco Chronicle (February 11).  

  9. Cantrell, Michael A. and others. 2001. An Ancient Retrovirus-like Element Contains Hot Spots for SINE Insertion. Genetics 158:769-777.

  10. Hillis, David M. 1999. SINEs of the perfect character. Proceedings of the National Academy of Sciences 96:9979-9981.

  11. Max, Edward E. Plagiarized Errors and Molecular Genetics. Creation/Evolution (XIX, p.34) 1986-2003. ( http://www.talkorigins.org/faqs/molgen/ )

  12. Lee, Jeannie T., Complicity of gene and pseudogene, Nature 423:26-28. 2003.

  13. Hirotsune, Shinji et al., An expressed pseudogene regulates the messenger-RNA stability of its homologous coding gene, Nature 423:91-96. 2003.

  14. Makalowski, Wojciech. 2003.  Not Junk After All, Science 300:1246-1247

  15. Balakirev, Evgeniy S., Ayala, Francisco J., PSEUDOGENES: Are They "Junk" or Functional DNA? Annual Review of Genetics, Vol. 37, pp. 123-151, December 2003 ( http://arjournals.annualreviews.org/doi/abs/10.1146%2Fannurev.genet.37.040103.103949 )

  16. W. Wayt Gibbs, The Unseen Genome: Gems among the Junk, Scientific American, November 2003, pp 45-53 ( Link )

  17. ENCODE Project Consortium et al., Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project, Nature 447, 799-816 (14 June 2007); Richard Ingham, Landmark study prompts rethink of genetic code, Yahoo News, accessed June 15, 2007 (Link1, Link2)

  18. Nishikimi, M. and Yagi, K. (1991) Molecular basis for the deficiency in humans of gulonolactone oxidase, a key enzyme for ascorbic acid biosynthesis. Am. J. Clin. Nutr. 54(6 Suppl):1203S-1208S.

  19. Nishikimi, M., Fukuyama, R., Minoshima, S., Shimizu, N. and Yagi. K. (1994) Cloning and chromosomal mapping of the human nonfunctional gene for L-gulono-gamma-lactone oxidase, the enzyme for L-ascorbic acid biosynthesis missing in man. J. Biol. Chem. 269:13685-13688.

  20. Ohta, Y. and Nishikimi, M. (1999) Random nucleotide substitutions in primate nonfunctional gene for L-gulono-gamma-lactone oxidase, the missing enzyme in L-ascorbic acid biosynthesis. Biochim. Biophys. Acta. 1472:408-411.

  21. Inai, Y., Ohta. Y., and Nishikimi, M. (2003) The whole structure of the human nonfunctional L-gulono-gamma-lactone oxidase gene--the gene responsible for scurvy--and the evolution of repetitive sequences thereon. J Nutr Sci Vitaminol (Tokyo) 49:315-319.

  22. Peter Borger, Shared mutations: Common descent or common mechanism?, The Independent Research Institute on Origins, Accessed 8/10/07 ( Link )

  23. Eugene Berezikov, Fritz Thuemmler, Linda W van Laake, Ivanela Kondova, Ronald Bontrop, Edwin Cuppen & Ronald H A Plasterk, "Diversity of microRNAs in human and chimpanzee brain", Nature Genetics, Vol 38 | Number 12 | December 2006 pp. 1375-1377. ( Link )

  24. Isidore Rigoutsos, Tien Huynh, Kevin Miranda, Aristotelis Tsirigos, Alice McHardy, and Daniel Platt, Short blocks from the noncoding parts of the human genome have instances within nearly all known genes and relate to biological processes, PNAS | April 25, 2006 | vol. 103 | no. 17 | 6605-6610 ( Link )

  25. Jill Cheng, Philipp Kapranov, Jorg Drenkow, Sujit Dike, Shane Brubaker, Sandeep Patel, Jeffrey Long, David Stern, Hari Tammana, Gregg Helt, Victor Sementchenko, Antonio Piccolboni, Stefan Bekiranov, Dione K. Bailey, Madhavan Ganesh, Srinka Ghosh, Ian Bell, Daniela S. Gerhard, Thomas R. Gingeras, Transcriptional Maps of 10 Human Chromosomes at 5-Nucleotide Resolution, Science 20 May 2005: Vol. 308. no. 5725, pp. 1149 - 1154 ( Link )

  26. Richard Twyman, Small RNA: BIG NEWS, The Human Genome, January 2005 ( Link )

  27. J. J. Bull, M. R. Badgett, H. A. Wichman, J. P. Huelsenbeck, D. M. Hillis, A. Gulati, C. Ho, and I. J. Molineux, Exceptional Convergent Evolution in a Virus, Genetics, 1997 December; 147(4): 1497–1507. ( Link )

  28. José M. Cuevas, Santiago F. Elena and Andrés Moya, Molecular Basis of Adaptive Convergence in Experimental Populations of RNA Viruses, Genetics, October 2002, 162: 533–542 ( Link )

  29. H A Wichman, L A Scott, C D Yarber, and J J Bull, Experimental evolution recapitulates natural evolution, Philos Trans R Soc Lond B Biol Sci. 2000 November 29; 355(1403): 1677–1684. ( Link )

  30. Troy E. Wood, John M. Burke and Loren H. Rieseberg, Parallel genotypic adaptation: when evolution repeats itself, Genetica, February 2005, Volume 123, Numbers 1-2, pp. 157-170 ( Link )

  31. F. Flam, Hints of a language in junk DNA, Science 266:1320, 1994.

  32. Haussler and Gill Bejerano, Junk DNA, May 6, 2004 online version of Science. ( Link )

  33. Stanford University Medical Center (2007, April 24). 'Junk' DNA Now Looks Like Powerful Regulator, Scientists Find. ScienceDaily. ( Link )

 

 

 
