
Friday, February 3, 2023

Chapter 6: How Many Genes? How Many Proteins?

Introduction
I think there are about 25,000 genes in the human genome but the annotated human genome says there are 45,000 and many scientists claim there are a lot more genes. Why is there a controversy over the number of genes? (pp. 136-137)
Defining a gene
It's important to have a usable definition of a gene. I define a gene as a DNA sequence that's transcribed to produce a functional product. The important point is that the gene product (RNA or protein) must have a biological function. (pp. 137-138)
[Dan Graur proposes a new definition of "gene"] [Gerald Fink promotes a new definition of a gene]
The molecular gene and the Mendelian gene
I'm talking about the molecular gene. The Mendelian gene is used in genetics and it's similar to the definition Richard Dawkins uses in his book The Selfish Gene. (pp. 138-139)
Counting genes
Draft sequences of genomes always contain predictions of large numbers of genes that are subsequently eliminated by annotators as more information becomes available. The current best estimates are that there are somewhat fewer than 20,000 protein-coding genes. (pp. 139-142)
[The 20th anniversary of the human genome sequence: 3. How many genes?] [How many protein-coding genes in the human genome? (2)] [How many protein-coding genes in the human genome?]
Counting proteins
The latest count is 18,407 proteins detected and 1,343 probable proteins that haven't yet been found for a total of 19,750. (pp. 142-143)
[How many proteins in the human proteome?]
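As a quick check on the arithmetic, the two counts quoted above can be added and the detected share worked out. This is only a minimal sketch; the variable names are mine and nothing here comes from the book beyond the two numbers.

```python
# Counts quoted in the summary above.
detected = 18_407   # proteins that have been detected
probable = 1_343    # probable proteins not yet found

total = detected + probable
detected_share = detected / total

print(total)                      # 19750
print(f"{detected_share:.1%}")    # 93.2%
```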
The functions of protein-coding genes
There are about 10,000 housekeeping genes that encode the proteins required for basic metabolic processes. (pp. 143-144)
Historical estimates of the number of genes
Historical estimates predicted that the human genome would have about 30,000 genes and those estimates turned out to be approximately correct. Guesstimates about larger numbers of genes (e.g. 100,000) were not based on facts. (pp. 144-146)
[False history and the number of genes: 2016]
Confusion about the number of genes
The popular press claimed that knowledgeable scientists were predicting 100,000 genes but that's not correct. (p. 147)
[Nature falls (again) for gene hype]
The Deflated Ego Problem
Many scientists don't believe that humans could only have the same number of genes as nematodes and flowering plants. I call this The Deflated Ego Problem. (pp. 147-149)
[Deflated egos and the G-value paradox] [Revisiting the deflated ego problem] [The Deflated Ego Problem]
Introns and the size of genes
A typical protein-coding gene is 61,700 bp long but most of this is introns. Coding regions occupy about 1% of the genome and introns take up 37%. Genes account for 45% of the genome when you add in the noncoding genes. This number is not widely reported in the popular press. (pp. 149-151)
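The percentages above can be sanity-checked with a rough back-of-envelope calculation. This is a sketch under stated assumptions: the genome size of about 3.1 Gb and the gene count of 19,750 (borrowed from the protein total under "Counting proteins") are my assumptions, and treating 61,700 bp as an average gene length is also an assumption, since "typical" could mean something else.

```python
# From the text.
TYPICAL_GENE_BP = 61_700
CODING_PCT = 1
INTRON_PCT = 37
ALL_GENES_PCT = 45

# Assumptions, not from the text.
GENOME_BP = 3_100_000_000    # assumed haploid genome size, about 3.1 Gb
N_PROTEIN_CODING = 19_750    # assumed gene count, taken from the protein total

# Share of the genome covered if every protein-coding gene had the typical length.
estimated = TYPICAL_GENE_BP * N_PROTEIN_CODING / GENOME_BP

# Percentage points left over once coding regions and introns are subtracted.
remainder_pct = ALL_GENES_PCT - (CODING_PCT + INTRON_PCT)

print(f"estimated protein-coding share: {estimated:.1%}")
print(f"coding + introns as stated: {CODING_PCT + INTRON_PCT}%")
print(f"remainder of the 45%: {remainder_pct} points")
```

Under these assumptions the estimate comes out near 39%, which is in the same range as the 38% obtained by adding the stated coding and intron percentages. The remaining 7 points of the 45% would then be whatever else is counted as genes, including the noncoding genes.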
Introns are mostly junk
The weight of evidence strongly favors the view that most of the DNA in introns is junk. The splice sites and the minimum amount of DNA required to form a loop suggest that only 50 bp in each intron is functional DNA. (pp. 151-152)
[Are introns mostly junk?] [Are splice variants functional or noise?]
   Box: Yeast loses its introns
Yeast has lost most of its introns since it diverged from other fungi. Most of the rest can be deleted without causing any decrease in fitness but a few seem to be essential. More than 98% of the introns in yeast are dispensable, confirming the idea that introns are mostly junk. (pp. 153-154)
[Yeast loses its introns]
Alternative splicing: common or rare?
One way to solve the Deflated Ego Problem is to assume that human genes can make many different proteins by an alternative splicing mechanism. There are many real examples of biologically relevant alternative splicing. (pp. 154-156)
[Debating alternative splicing (Part I)] [Debating alternative splicing (Part II)] [Debating alternative splicing (Part III)] [Debating alternative splicing (Part IV)]
How does alternative splicing work?
Biologically relevant alternative splicing occurs when splicing factors alter the activity of the spliceosome. Splicing errors are common, and misspliced transcripts (junk RNA) are easily detected and end up in the transcript databases. (pp. 156-160)
Splicing errors are the best explanation
It's relatively easy to identify most splicing errors and eliminate those transcripts from the annotated reference genome. The vast majority of splice variants fall into the splicing errors category. (pp. 160-163)
[Splicing errors or alternative splicing?] [Alternative splicing and evolution] [Using conservation to determine whether splice variants are functional] [Splice variants of the human triose phosphate isomerase gene: is alternative splicing real?]
The case for splicing errors
There are four good reasons for concluding that true alternative splicing is confined to less than 5% of human protein-coding genes. (p. 163)
[The frequency of splicing errors reflects the balance between selection and drift]
The controversy and how it’s reported
The controversy over the abundance of real alternative splicing is mostly ignored in the scientific literature and in the popular press. It is widely assumed that almost all human genes are alternatively spliced. (pp. 164-165)
[Alternative splicing: function vs noise] [The persistent myth of alternative splicing] [The textbook view of alternative splicing] [The proteome complexity myth]
   Box: The false logic of the argument for complexity
If alternative splicing is going to solve the Deflated Ego Problem then it must distinguish humans from other species. But all species produce abundant transcripts due to splicing errors so humans are no different than nematodes or flowering plants. (pp. 166-167)
[Alternative splicing in the nematode C. elegans]
Alternative splicing and disease
Genetic diseases can be caused by errors in splicing. Their widespread occurrence is taken to be proof that alternative splicing is ubiquitous, but disease-causing splice errors can also occur in junk DNA. (pp. 167-169)
Notes for Chapter 6 (pp. 324-327)