Stanford Encyclopedia of Philosophy
This is a file in the archives of the Stanford Encyclopedia of Philosophy.

Molecular Biology

First published Sat Feb 19, 2005; substantive revision Wed Sep 9, 2009

The field of molecular biology studies macromolecules and the macromolecular mechanisms found in living things, such as the molecular nature of the gene and its mechanisms of gene replication, mutation, and expression. Given the fundamental importance of these macromolecular mechanisms throughout the history of molecular biology, it will be argued that a philosophical focus on the concept of a mechanism generates the clearest picture of molecular biology's history, concepts, and case studies utilized by philosophers of science.

This encyclopedia entry is organized around these three themes. First, a historical overview of the developments in molecular biology from its origins to the present pays special attention to the features of this history referenced by philosophers. Philosophical analysis then turns to the key concepts in the field: mechanism, information, and the gene. Finally, philosophers have used molecular biology as a case study to address more general issues in the philosophy of science, such as theory reduction and scientific explanation, each of which is understood most clearly in molecular biology with a focus on the field's attention to mechanisms.


1. History of Molecular Biology

Despite its prominence in the contemporary life sciences, molecular biology is a relatively young discipline, originating in the 1930s and 1940s, and becoming institutionalized in the 1950s and 1960s. It should not be surprising, then, that many of the philosophical issues in molecular biology are closely intertwined with this recent history. This section sketches four facets of molecular biology's development: its origins, its classical problems, its subsequent migration into other biological domains, and its more recent turn to genomics and post-genomics. The rich historiography of molecular biology can only be briefly utilized in this shortened history. (See, for example, Abir-Am 1985, 1987, 1994, 2006; Burian 1993; de Chadarevian 2002, 2003; de Chadarevian and Gaudilliere 1996; de Chadarevian and Strasser 2002; Holmes 2001; Judson 1980, 1996; Kay 1993; Morange 1998; Olby 1979, 1990, 1994, 2003; Rheinberger 1997; Sapp 1992; Sarkar 1996a; Witkowski 2005; Zallen 1996. Also see autobiographical accounts by biologists, such as Brenner 2001; Cohen 1984; Crick 1988; Echols 2001; Jacob 1988; Kornberg 1989; Luria 1984; Watson 1968, 2002, 2007; Wilkins 2003.)

1.1 Origins

The field of molecular biology arose from the convergence of work by geneticists, physicists, and structural chemists on a common problem: the structure and function of the gene. In the early twentieth century, although the nascent field of genetics was guided by Mendel's law of segregation (two alleles of a gene separate, i.e., segregate, during the formation of the germ cells so that each germ cell has one but not the other) and law of independent assortment (genes in different linkage groups assort independently in the formation of germ cells), the actual mechanisms of gene reproduction, mutation and expression remained unknown. Thomas Hunt Morgan and his colleagues utilized the fruit fly, Drosophila, as a model organism to study the relationship between the gene and the chromosomes in the hereditary process (Morgan 1926; discussed in Darden 1991; Darden and Maull 1977; Kohler 1994; Roll-Hansen 1978; Wimsatt 1992). A former student of Morgan's, Hermann J. Muller, recognized the “gene as a basis of life,” and so set out to investigate its structure (Muller 1926). Muller discovered the mutagenic effect of X-rays on Drosophila, and utilized this phenomenon as a tool to explore the size and nature of the gene (Carlson 1966, 1971, 1981; Crow 1992; Muller 1927). But despite the power of mutagenesis, Muller recognized that, as a geneticist, he was limited in the extent to which he could explicate the more fundamental properties of genes and their actions. He concluded a 1936 essay: “The geneticist himself is helpless to analyse these properties further. Here the physicist, as well as the chemist, must step in. Who will volunteer to do so?” (Muller 1936, 214)

Muller's request did not go unanswered. The next decade saw several famous physicists turn their attention to the biological question of inheritance (Keller 1990; Kendrew 1967). In What is Life?, the physicist Erwin Schroedinger (1944) proposed ways in which the principles of quantum physics might account for the stability, yet mutability, of the gene (see the entry on life). Schroedinger speculated that the gene might be a kind of irregular “aperiodic crystal” playing a role in a “hereditary code-script.” This book influenced many younger scientists, physicists as well as biologists, considering new avenues of research (Elitzur 1995; Moore 1989; Olby 1994, 240–247; Sarkar 1991; for a reinterpretation see Kay 2000, 59–66).

A more substantive impact came from the migration of Max Delbrueck into biology. Delbrueck became interested in the physical basis of heredity after hearing a lecture by his teacher, quantum physicist Niels Bohr (1933), which expounded a principle of complementarity between physics and biology (McKaughan 2005; Roll-Hansen 2000). In contrast to Schroedinger, Bohr (and subsequently Delbrueck) did not seek to reduce biology to physics; instead, the goal was to understand how each discipline complemented the other (Delbrueck 1949). Delbrueck, with this framework in mind, visited Morgan's fly lab in 1937. But rather than turning his attention to Drosophila, Delbrueck considered even the fruit fly too complex to unravel the unique characteristic of life: self-reproduction. Delbrueck chose to use bacteriophage, viruses that infect bacteria and then multiply very rapidly. The establishment of “The Phage Group” in the early 1940s by Delbrueck and another physicist-turned-biologist Salvador Luria marked a critical point in the rise of molecular biology (Brock 1990; Cairns et al. 1966; Fischer and Lipson 1988; Fleming 1968; Lewontin 1968; Luria 1984; Morange 1998, Ch. 4; Stent 1968). A famous phage experiment by Alfred Hershey and Martha Chase (1952) tracked the chemical components of phage as they entered bacteria. The results provided evidence, adding to that of Oswald Avery's earlier work on bacteria (Avery et al. 1944), that genes were not proteins but deoxyribonucleic acid (DNA).

While Delbrueck facilitated the collaboration between physicists and biologists, he was largely dismissive of the chemical details that bridged these fields. This was in contrast to Delbrueck's colleague at Cal Tech, Linus Pauling, who utilized his knowledge of structural chemistry to study macromolecular structure. Pauling did both theoretical and experimental work important in the subsequent development of molecular biology. His theoretical work on the nature of the chemical bond supplied an understanding of how large macromolecules could be stable (Pauling 1939). The concept of stable macromolecules, encompassing both proteins and nucleic acids, was a necessary prerequisite to the study of their structure (Olby 1979). Pauling, in contrast to biochemists, investigated weak forms of bonding, such as hydrogen bonding. These weaker forms of bonding were later discovered to play important roles in the structures and functions of proteins and nucleic acids (Crick 1996; Sarkar 1998). Pauling's lab at Cal Tech also made use of the technique of x-ray crystallography. It provided a means to investigate molecular structure. X-rays bombarding a molecule left unique images on photographic plates due to the diffraction of the x-rays by the molecule. Combining this powerful methodology with the building of scale models, Pauling discovered the alpha-helical structure of proteins (Pauling and Corey 1950; Pauling et al. 1951), and eventually set his sights on the structure of DNA (Pauling and Corey 1953; for historical treatments of this research see Hager 1995; Pauling 1970).

Recognizing quite early the importance of these new physical and structural chemical approaches to biology, Warren Weaver, then the director of the Natural Sciences section of the Rockefeller Foundation, introduced the term “molecular biology” in a 1938 report to the Foundation. Weaver wrote,

And gradually there is coming into being a new branch of science—molecular biology—which is beginning to uncover many secrets concerning the ultimate units of the living cell….in which delicate modern techniques are being used to investigate ever more minute details of certain life processes. (quoted in Olby 1994, 442)

But perhaps a more telling account of the term's origin came from Francis Crick's explanation for why he began calling himself a molecular biologist: “I myself was forced to call myself a molecular biologist because when inquiring clergymen asked me what I did, I got tired of explaining that I was a mixture of crystallographer, biophysicist, biochemist, and geneticist, an explanation which in any case they found too hard to grasp” (quoted in Stent 1969, 36).

Crick mentioned he was, in part, a biochemist. Likewise, Michel Morange (1998) said, “Molecular biology is a result of the encounter between genetics and biochemistry, two branches of biology that developed at the beginning of the twentieth century.” Both molecular biologists and biochemists did (and continue to) work at the same size level and investigate some of the same mechanisms, such as protein synthesis. However, the two fields had different historical trajectories.

The early history of the two fields may be somewhat simplistically divided according to Aristotle's two features of life: biochemistry was concerned with nutrition (recharacterized as metabolism more generally) and molecular biology (along with its more direct predecessor classical genetics) investigated reproduction. In contrast to molecular biology, biochemistry emerged as a field earlier in the twentieth century. It traced its roots to animal chemistry and medical chemistry of the nineteenth century (Kohler 1982). Much of biochemistry's focus (from the perspective of what is important for molecular biology's questions about the genetic material) was on proteins and enzymes. The gene, usually of little concern to biochemists, was thought to be a protein until evidence in favor of DNA began to emerge in the 1940s and 1950s. In biochemical textbooks prior to 1953, nucleic acids (DNA and ribonucleic acid, RNA) were relegated to a minor chapter. The discoveries of the twenty-some amino acids, the building blocks of proteins, were major achievements of early twentieth century biochemistry. Items of interest to biochemists were covalent bonding (a strong form of chemical bonding that connects amino acids in proteins), the action of enzymes (proteins that act as catalysts in biochemical reactions), and the energy requirements for reactions to occur. After Watson and Crick's (1953a) discovery of the structure of DNA, biochemistry showed increased emphasis on nucleic acids (see, e.g., White et al.'s Principles of Biochemistry from 1954 through subsequent editions, e.g., White et al. 1978).

This brief recapitulation of the origins of molecular biology reflects themes addressed by philosophers, such as reduction (see Section 3.1) and the concept of the gene (see Section 2.3). For Schroedinger, biology was to be reduced to the more fundamental principles of physics, while Delbrueck instead resisted such a reduction and sought what made biology unique. Muller's shift from classical genetics to the study of gene structure raises the question of the relation between the classical and molecular concept of the gene. These issues will be examined below.

1.2 Classical Problems

Molecular biology's classical period began in 1953, with James Watson and Francis Crick's discovery of the double helical structure of DNA (Watson and Crick 1953a, 1953b). Watson and Crick's scientific relationship unified the various disciplinary approaches discussed above: Watson, a student of Luria and the phage group, recognized the need to utilize crystallography to elucidate the structure of DNA; Crick, a physicist enticed by Schroedinger's What is Life? to turn to biology, became trained in, and contributed to the theory of, x-ray crystallography. At Cambridge University's Cavendish Laboratory, Watson and Crick found that they shared interests in genes and the structure of DNA.

In the oft-told story (Watson 1968), Watson and Crick collaborated to build a model of the double helical structure of DNA, with its two helical strands held together by hydrogen-bonded base pairs. They made extensive use of data from x-ray crystallography work on DNA by Maurice Wilkins and Rosalind Franklin at King's College, London (Maddox 2002; for a discussion of Franklin's work after her research on DNA see Creager and Morgan 2008), Crick's theoretical work on crystallography (Crick 1988), and the model building techniques pioneered by Pauling (de Chadarevian 2002; Judson 1996; Olby 1970, 1994, Forthcoming).

With the structure of DNA in hand, molecular biology shifted its focus to how the double helical structure aided elucidation of the mechanisms of genetic replication and function, the keys to understanding the role of genes in heredity. This subsequent research was guided by the notion that the gene was an informational molecule. The linear sequence of nucleic acid bases along a strand of DNA provided coded information for directing the linear ordering of amino acids in proteins (Crick 1958). The genetic code came to be characterized as the relation between a set of three bases on the DNA (“a codon”) and one of twenty amino acids, the building blocks of proteins. Attempts to unravel the genetic code included failed theoretical efforts, as well as competition between geneticists (Crick et al. 1961) and biochemists. An important breakthrough came in 1961 when biochemists Marshall Nirenberg and J. Heinrich Matthaei, at the (US) National Institutes of Health (NIH), discovered that a unique sequence of nucleic acid bases could be read to produce a unique amino acid product (Nirenberg and Matthaei 1961; for discussion, see Judson 1996; Kay 2000).
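The relation between codons and amino acids described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete table of the genetic code: the dictionary lists only a handful of the 64 codons, it is written in the messenger RNA alphabet (U in place of T), and the names `CODON_TABLE` and `translate` are introduced here purely for illustration. The poly-U input reflects the 1961 Nirenberg and Matthaei result, in which a chain of uracil bases directed the synthesis of a chain of phenylalanine.

```python
# Partial RNA codon table (standard genetic code); only a few of the 64 codons are listed.
CODON_TABLE = {
    "UUU": "Phe", "UUC": "Phe",
    "AUG": "Met",                      # also the usual start codon
    "GGU": "Gly", "GGC": "Gly",
    "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def translate(rna):
    """Read an RNA string three bases at a time and return the amino acids."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        residue = CODON_TABLE.get(codon, "?")  # "?" marks codons missing from this partial table
        if residue == "Stop":
            break
        protein.append(residue)
    return protein

print(translate("UUUUUUUUU"))      # poly-U message
print(translate("AUGGGUAAAUAA"))   # short made-up message ending in a stop codon
```

The lookup is deliberately one-directional: several codons map to the same amino acid, so the sequence of bases fixes the sequence of amino acids but not the reverse.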

With the genetic code elucidated and the relationship between genes and their molecular products traced, it seemed in the late 1960s that the concept of the gene was secure in its connection between gene structure and gene function. The machinery of protein synthesis translated the coded information in the linear order of nucleic acid bases into the linear order of amino acids in a protein. However, this simple “colinearity” did not persist. In the late 1970s, a series of discoveries by molecular biologists complicated the straightforward relationship between a single, continuous DNA sequence and its protein product. Overlapping genes were discovered (Barrell et al. 1976); such genes were considered “overlapping” because two different amino acid chains might be read from the same stretch of nucleic acids by starting from different points on the DNA sequence. And split genes were found (Berget et al. 1977; Chow et al. 1977). In contrast to the colinearity hypothesis that a continuous nucleic acid sequence generated an amino acid chain, it became apparent that stretches of DNA were often split between coding regions (exons) and non-coding regions (introns). Moreover, the exons might be separated by vast portions of this non-coding, supposedly “junk DNA.” The distinction between exons and introns became even more complicated when alternative splicing was discovered the following year (Berk and Sharp 1978). A series of exons could be spliced together in a variety of ways, thus generating a variety of molecular products. Discoveries such as overlapping genes, split genes, and alternative splicing forced molecular biologists to rethink their understanding of what actually made a gene…a gene (Portin 1993; for a survey of such complications see Gerstein et al. 2007, Table 1).
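Two of the complications just described, overlapping genes and alternative splicing, can be made concrete with a short Python sketch. Everything in it is hypothetical: the sequences are invented, and the function names `codons` and `splice_variants` are assumptions introduced for illustration. The splicing function also makes a strong simplification, namely that the first exon is always kept and that every ordered subset of the remaining exons is a possible product; in actual cells, splicing is regulated and only some combinations occur.

```python
from itertools import combinations

def codons(seq, start):
    """Split seq into complete triplets, beginning at offset start."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]

# Overlapping genes: one stretch of bases read in two different frames.
dna = "ATGGCATTGACC"            # made-up sequence
print(codons(dna, 0))           # frame beginning at the first base
print(codons(dna, 1))           # frame shifted by one base

# Split genes and alternative splicing: introns are removed, and
# different subsets of exons (kept in their original order) are joined.
exons = ["ATGGCA", "TTGACC", "GGATAA"]   # made-up exon sequences

def splice_variants(exons):
    """Return transcripts that keep the first exon plus any ordered subset of the rest."""
    first, rest = exons[0], exons[1:]
    variants = []
    for k in range(len(rest) + 1):
        for chosen in combinations(rest, k):
            variants.append(first + "".join(chosen))
    return variants

for variant in splice_variants(exons):
    print(variant)
```

In this toy case, the same twelve bases yield two unrelated triplet series, and three exons yield four distinct transcripts, which is the sense in which a single stretch of DNA no longer corresponds to a single product.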

These developments in molecular biology have received philosophical scrutiny. Molecular biologists sought to discover mechanisms (see Section 2.1), drawing the attention of philosophers to this concept. Also, conceptualizing DNA as an informational molecule (see Section 2.2) was a move that philosophers have subjected to critical scrutiny. Finally, the concept of the gene (see Section 2.3) itself has intrigued philosophers. Complex molecular mechanisms, such as alternative splicing, have obligated philosophers to consider to what the term “gene” actually refers.

1.3 Going Molecular

In a 1963 letter to Max Perutz, molecular biologist Sydney Brenner foreshadowed what would be molecular biology's next intellectual migration:

It is now widely realized that nearly all the “classical” problems of molecular biology have either been solved or will be solved in the next decade…. Because of this, I have long felt that the future of molecular biology lies in the extension of research to other fields of biology, notably development and the nervous system. (Brenner, letter to Perutz, 1963)

Along with Brenner, in the late 1960s and early 1970s, many of the leading molecular biologists from the classical period redirected their research agendas, utilizing the newly developed molecular techniques to investigate unsolved problems in other fields.

The discovery of coordinated gene regulation in bacteria by molecular biologists at first appeared to provide a general, theoretical model for transforming descriptive embryology into molecular developmental biology. Francois Jacob, Jacques Monod and their colleagues at the Institut Pasteur in Paris discovered that three genes were coordinately controlled. Escherichia coli bacteria normally did not make enzymes for metabolizing the sugar in milk, but when placed in a medium with lactose as a food source, genes for metabolizing that sugar were induced. Work on induction showed that the inducer deactivated a repressor, a protein that was bound to the DNA and stopped synthesis of the messenger RNA that produced the enzymes. The group of coordinately controlled genes and their regulatory DNA sites was called an “operon” (Jacob and Monod 1961; discussed in Morange 1998, Ch. 14; Schaffner 1974a). At first, it was assumed that this derepression model might prove to be the way, in general, that genes were controlled in organisms undergoing embryological development. As most cells in developing organisms seemed to have the same amount of DNA, it had long been a puzzle how they differentiated into the many different cell types in the body. However, further work showed that many different forms of gene regulation occurred other than by derepression. Nonetheless, molecular biology aided in embryology “going molecular”, as developmental biologists began the study of gene regulation during embryological development. That work continues today (see the entry on developmental biology).
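The repressor logic of the operon model described above amounts to a double negation, which a few lines of Python can make explicit. This is a deliberately minimal sketch: the function name `operon_transcribed` is an assumption introduced here, and the model covers only the repressor and inducer, leaving out every other form of regulation.

```python
def operon_transcribed(lactose_present):
    """Simplified repressor logic: the inducer switches off the thing that switches off the genes."""
    # The inducer (available when lactose is present) deactivates the repressor.
    repressor_active = not lactose_present
    # An active repressor sits on the DNA and blocks messenger RNA synthesis.
    return not repressor_active

for lactose in (False, True):
    state = "on" if operon_transcribed(lactose) else "off"
    print(f"lactose present: {lactose} -> enzyme genes {state}")
```

The genes are expressed by removing an inhibition rather than by adding an activation, which is why the scheme is described as derepression.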

In addition to developmental biology, the study of behavior and the nervous system lured some molecular biologists. Finding appropriate model organisms that could be subjected to molecular genetic analyses proved challenging. Returning to the fruit flies used in classical genetics, Seymour Benzer induced behavioral mutations in Drosophila as a “genetic scalpel” to investigate the pathways from genes to behavior (Benzer 1968; Weiner 1999). At Cambridge, Sydney Brenner developed the nematode worm, Caenorhabditis elegans, to study the nervous system, as well as the genetics of behavior (Brenner 1973, 2001; Ankeny 2000; Brown 2003). Nirenberg used neuroblastomas (malignant tumors composed of undifferentiated neurons) as a model system to study the development of neural tissue (for an online exhibit of Nirenberg's transition to neurobiology, see the National Library of Medicine's Profiles in Science study of Nirenberg, The Marshall W. Nirenberg Papers).

The techniques of molecular biology enabled numerous other fields to go molecular. The study of cells was transformed from descriptive cytology into molecular cell biology (Alberts et al. 1983; Alberts et al. 2002; Bechtel 2006). Molecular evolution developed as a phylogenetic method for the comparison of DNA sequences and whole genomes (Dietrich 1998). The immunological relationship between antibodies and antigens was recharacterized at the molecular level (Podolsky and Tauber 1997; Schaffner 1993; see also the entry on the biological notion of self and non-self). The study of oncogenes in cancer research is just one example of molecular medicine (Morange 1997). However, not all attempts to find the molecular basis of biological phenomena met with early success, such as the claim that RNA molecules coded memories (Morange 1998).

This expansion of molecular biology as other fields went molecular led some to distinguish molecular genetics from molecular biology. Some philosophers use “molecular genetics” synonymously with classical molecular biology (Kitcher 1984). Another usage of “molecular genetics” is to refer to any study of genetics at the molecular level, that is, to any study of the molecular biology of the gene (see, for example, the entry on molecular genetics). Alternatively, “molecular genetics” may refer to macromolecular-level investigations of results produced by cross-breeding variants to produce hybrid organisms (a usage extended from techniques of traditional Mendelian genetics to genetic manipulations in bacteria).

1.4 Going Genomic...and Post-Genomic

In the 1970s, as many of the leading molecular biologists were migrating into other fields, molecular biology itself was going genomic (see the entry on genetics and genomics). The genome is the complete collection of nucleic acid base pairs within an organism's cells (adenine (A) pairs with thymine (T) and cytosine (C) with guanine (G)). The number of base pairs varies widely among species. For example, the bacterium Haemophilus influenzae (the first free-living organism to have its genome sequenced) has roughly 1.8 million base pairs in its genome (Fleischmann et al. 1995), while Homo sapiens carries more than 3 billion base pairs in its genome (International Human Genome Sequencing Consortium 2001 [“Consortium” hereafter], Venter et al. 2001). The history of genomics is the history of the development and use of new experimental and computational methods for producing, storing, and interpreting such sequence data (Ankeny 2003).

Frederick Sanger played a seminal role in initiating such developments. Sanger developed protein sequencing techniques and used them to elucidate the amino acid sequence of the protein insulin in the mid-1950s. In 1962, Sanger began sequencing nucleic acids and developed increasingly improved techniques in the 1970s. Kary Mullis, inspired in part by Sanger's sequencing methodologies, developed polymerase chain reaction (PCR), a procedure wherein small samples of DNA were amplified (Saiki et al. 1985; for historical treatments of the discovery of PCR, see Rabinow 1996; Morange 1998). At Harvard, Allan Maxam and Walter Gilbert developed another sequencing method, which proved less efficient than Sanger's (Maxam and Gilbert 1977; a sequencing autobiography can be found in Sanger 1988; also see Culp 1995; de Chadarevian 2002; Judson 1992; Little 2003).

In the mid-1980s, after the development of sequencing techniques, the United States Department of Energy (DoE) originated a project to sequence the human genome (initially as part of a larger plan to determine the impact on the human genome of radiation from the Hiroshima and Nagasaki bombings). The resulting Human Genome Project (HGP), managed jointly by the DoE and the NIH, both utilized existing sequencing methodologies and introduced new ones (see the entry on the human genome project). Indeed, the now-famous controversy that emerged between the public Consortium of international sequencing centers and the private sequencing corporation Celera in their race to generate a “rough draft” of the human genome was predicated on different sequencing methodologies and debates over the accuracy and efficiency of such methodologies (Consortium 2001; Venter et al. 2001; for reflections on this race, see Carmen 2004; Cook-Deegan 1994; Davies 2001; Roberts 2001; Sulston and Ferry 2002; Venter 2002, 2007).

While the human genome project received most of the public attention, hundreds of genomes have been sequenced to date, including those of the cat (Pontius et al. 2007), the mouse (Waterston et al. 2002), and rice (Goff et al. 2002). One of the most shocking results of those sequencing projects was the total number of genes (defined in this context as stretches of DNA that code for a protein product) found in the genomes. The human genome contains 20,000 to 25,000 genes, the cat 20,285, the mouse 24,174, and rice 32,000 to 50,000. So in contrast to early assumptions that gene number correlated with organismal complexity, it turned out that neither organismal complexity nor even position on the food chain was predictive of gene number.

The successes of genomics have encouraged a number of disciplines to “go genomic”, including behavioral genetics (Plomin et al. 2003), developmental biology (Srinivasan and Sommer 2002), cell biology (Taniguchi et al. 2002), and evolution (Ohta and Kuroiwa 2002). What's more, genomics has been institutionalized with textbooks (Cantor and Smith 1999) and journals, such as Genomics and Genome Research. And the human genome project itself has turned its attention from a standardized human genome to variation between genomes in the form of the Human Genome Diversity Initiative (Gannett 2003).

But just as a number of disciplines “went molecular” while molecular biology itself was wrestling with the complexities posed by split genes and overlapping genes, so too are fields going genomic while genomics itself is wrestling with the complexities posed by how a mere 20,000 genes can construct a human while a grain of rice requires 50,000 genes. Thus, genomics is now supplemented by post-genomics. There is ongoing debate about what actually constitutes post-genomics (Morange 2006), but the general trend is a focus beyond the mere sequence of A's, C's, T's, and G's and instead on the complex, cellular mechanisms involved in generating such a variety of protein products from a relatively small number of protein-coding regions in the genome. Post-genomics utilizes the sequence information provided by genomics but then situates it in a systems-level analysis of all the other entities and activities involved in the mechanisms of transcription (transcriptomics), regulation (regulomics), and expression (proteomics).

Developments in genomics and post-genomics have sparked a number of philosophical questions about molecular biology. Since the genome requires a vast array of other mechanisms to facilitate the generation of a protein product, can DNA really be causally prioritized (see Section 2.3)? Similarly, in the face of such interdependent mechanisms involved in transcription, regulation, and expression, can DNA alone be privileged as the bearer of hereditary information, or is information distributed across all such entities and activities (see Section 2.2)? And how should this systems-level analysis be conceptualized? Are the systems of post-genomics simply a collection of mechanisms, or is there something epistemologically and metaphysically different between systems and mechanisms (see Section 2.1)?

2. Concepts in Molecular Biology

Key concepts in molecular biology are mechanism, information, and gene. For example, in a seminal paper announcing the discovery of messenger RNA (the intermediary between the DNA of the gene and the protein for which it carries information), Francois Jacob and Jacques Monod claimed,

The property attributed to the structural messenger of being an unstable intermediate is one of the most specific and novel implications of this scheme…This leads to a new concept of the mechanism of information transfer, where the protein synthesizing centers (ribosomes) play the role of non-specific constituents which can synthesize different proteins, according to specific instructions which they receive from the genes through M-RNA. (Jacob and Monod 1961, 353, emphasis added; for other examples of such language, see Morange 1998, 176; Davis 1980, 78)

Hence, major tasks for philosophers of molecular biology have been and continue to be analyzing the concepts of mechanism, information, and gene in order to understand how they have been, are, and should be used.

2.1 Mechanism

As the brief history of molecular biology showed, molecular biologists discover and explain by identifying and elucidating mechanisms, such as DNA replication, protein synthesis, and the myriad mechanisms of gene expression. The phrase “theory of molecular biology” was not used and for good reason; general knowledge in the field is represented by diagrams of mechanisms. Discovering the mechanism that produces a phenomenon is an important accomplishment for several reasons. First, knowledge of a mechanism shows how something works: elucidated mechanisms provide intelligibility. In some cases, one can literally see how the mechanism works from beginning to end. One can run a simulation in the mind's eye. Second, knowing how a mechanism works allows predictions to be made based upon the regularity in mechanisms. One may be able to say how a mechanism would work if another instance is encountered, or if conditions or inputs are changed. Third, knowledge of mechanisms potentially allows one to intervene to change what the mechanism produces, to manipulate its parts to construct experimental tools, or to repair a broken, diseased mechanism. In short, knowledge of elucidated mechanisms provides understanding, prediction, and control. Given the general importance of mechanisms and the fact that mechanisms play such a central role in the field of molecular biology, it is not surprising that philosophers of biology pioneered analyzing the concept of mechanism.

The new mechanistic perspective in philosophy of science developed, in part, in philosophy of molecular biology (as well as in studies of cell biology and neuroscience). As early as the 1970s, William Wimsatt (1972, 67) said, “At least in biology, most scientists see their work as explaining types of phenomena by discovering mechanisms...” In their seminal book, Discovering Complexity, William Bechtel and Robert Richardson (1993) investigated the roles of decomposition and localization as strategies for discovering mechanisms. Richard Burian (1993, 389) noted that molecular biology “mainly studies molecular mechanisms,” and expanded this perspective in Burian (2005).

Peter Machamer, Lindley Darden, and Carl Craver (2000) took this peripheral focus on mechanisms and brought it to center stage, providing a characterization of mechanisms as used in molecular biology and neurobiology. Machamer, Darden, and Craver proposed: “Mechanisms are entities and activities organized such that they are productive of regular changes from start or set-up to finish or termination conditions.” In molecular biological mechanisms, types of entities include macromolecules (such as proteins and the nucleic acids, DNA and RNA), and sub-cellular structures, such as ribosomal particles (composed of RNA and proteins). Types of activities include geometrico-mechanical activities, such as lock and key docking of an enzyme and its substrate, and chemical bonding activities, such as the formation of strong covalent bonds and weak hydrogen bonds. The entities and activities are organized in productive continuity from beginning to end; that is, each stage gives rise to the next. Entities having certain kinds of activity-enabling properties allow the possibility of acting in certain ways, and certain kinds of activities are only possible when there are entities having certain activity-enabling properties (Darden 2002; Darden and Craver 2002). The organization of the entities and activities determines the ways in which they produce the phenomenon. Entities often must be appropriately located, structured, and oriented, and the activities in which they engage must have a temporal order, rate, and duration. To give a description of a mechanism for a phenomenon is to explain that phenomenon, i.e., to explain how it was produced (see also Glennan 2002; Tabery 2004).

A typical example of a molecular biological mechanism is the mechanism of DNA replication. As Watson and Crick (1953a) famously noted upon discovery of the structure of DNA, the macromolecule's structure pointed to the mechanism of DNA replication: “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.” In short, the double helix of DNA (an entity) unwinds (an activity) and new component parts (entities) bond (an activity) to both parts of the unwound DNA helix. DNA is a nucleic acid composed of several subparts: a sugar-phosphate backbone and nucleic acid bases. When DNA unwinds, the bases exhibit weak charges, properties that result from slight asymmetries in the molecules. These weak charges allow a DNA base and its complement to engage in the activity of forming hydrogen (weak polar) chemical bonds; the specificity of this activity is due to the geometric arrangements of the weak polar charges in the subparts of the base. Ultimately, entities with polar charges enable the activity of hydrogen bond formation. After the complementary bases align, then the backbone forms via stronger covalent bonding. The mechanism proceeds with unwinding and bonding together (activities) new parts, to produce two helices (newly formed entities) that are (more or less faithfully) copies of the parent helix.
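The copying mechanism just described can be illustrated with a deliberately simplified sketch. It assumes that strands can be treated as strings of the letters A, T, G, and C, and that pairing is exact; it ignores strand orientation, the enzymes involved, and copying errors. The names `COMPLEMENT`, `complement_strand`, and `replicate` are hypothetical labels introduced only for this illustration.

```python
# Toy model: strands are strings over A, T, G, C, written so that position i
# of one strand pairs with position i of the other (orientation is ignored).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement_strand(template):
    """Return the strand whose bases pair with those of the template."""
    return "".join(COMPLEMENT[base] for base in template)

def replicate(helix):
    """Unwind a two-stranded helix and pair new bases to each old strand."""
    strand_1, strand_2 = helix
    daughter_1 = (strand_1, complement_strand(strand_1))
    daughter_2 = (complement_strand(strand_2), strand_2)
    return daughter_1, daughter_2

parent = ("ATGCCA", "TACGGT")
daughters = replicate(parent)
# Both daughter helices should match the parent helix in this idealized case.
print(daughters[0] == parent and daughters[1] == parent)
```

In this idealized case each daughter helix keeps one parental strand and gains one newly paired strand, which is one way to picture how specific pairing "suggests a possible copying mechanism."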

Nucleic acids and proteins are among the working entities (Darden 2005) of molecular biological and biochemical mechanisms, such as DNA replication and protein synthesis. Smaller (or larger) entities do not have the requisite sizes, shapes, charges, or other activity-related properties to play their specific roles in these molecular mechanisms. Working entities may have localized active sites, such as the portion of the protein that serves as the site for the docking of a substrate to an enzyme. Alternatively, active sites may be distributed throughout the entity, as the slightly charged bases are distributed along the unwound DNA double helix. Of course, the double helix has subparts, such as the components of the atomic nuclei inside the atoms within the macromolecule; however, those are not the working entities in the mechanism of DNA replication. If nuclear fission were occurring, then atomic nuclei would be working entities, and DNA replication would certainly not be occurring.

Scientists rarely depict all the particular details when describing a mechanism; representations are usually schematic, often depicted in diagrams. Such representations may be called a “model of a mechanism,” or “mechanism schema.” A mechanism schema is a truncated abstract description of a mechanism that can be instantiated by filling it with more specific descriptions of component entities and activities. An example is James Watson's (1965) diagram of his version of the central dogma of molecular biology:

DNA → RNA → protein.

This is a schematic representation (with a high degree of abstraction) of the mechanism of protein synthesis, which can be instantiated with details of DNA base sequence, complementary RNA sequence, and the corresponding order of amino acids in the protein produced by the more specific mechanism. Molecular biological textbooks (e.g., Watson et al. 2007) are replete with diagrams of mechanism schemas. An examination of such texts makes clear that the general knowledge in molecular biology consists of a set of mechanism schemas, such as those for the mechanisms of DNA replication and repair, protein synthesis, and gene regulation. Many have very wide scope, applying to most living things.

A mechanism schema can be instantiated to yield a description of a particular mechanism. In contrast, a mechanism sketch cannot (yet) be instantiated; components are (as yet) unknown. Sketches have black boxes for missing components or grey boxes whose function is known but for which the entities and activities that carry out that function are not yet elucidated. Such sketches guide new research to fill in the details.

Molecular biology, thus, is best understood as engaged in the discovery and elucidation of mechanisms. This insight offers the best formulation of key concepts in the field, such as information and the gene. In addition, this insight reconfigures traditional debates in the philosophy of science which have drawn on molecular biology. For example, the much-debated relationship between classical genetics and molecular biology is explicated as a focus on different, though serially-connected, mechanisms in the process of heredity (Darden 2006a, 2006b, see Section 3.1); likewise, an appreciation for mechanistic explanations in molecular biology accounts for biological explanations that incorporate both regularity and variation (Tabery 2009, see Section 3.2).

2.2 Information

As revealed in the history of molecular biology, the language of information is used ubiquitously by molecular biologists. Genes as linear DNA sequences of bases are said to carry “information” for the production of proteins. During protein synthesis, the information is “transcribed” from DNA to messenger RNA and then “translated” from RNA to protein. During DNA replication, and subsequent inheritance, it is often said that what is passed from one generation to the next is the “information” in the genes, namely the linear ordering of bases along complementary DNA strands. Historians of biology have tracked the entrenchment of information-talk in molecular biology, and philosophers of biology have questioned whether a definition of “information” can be provided that adequately captures its usage in the field.

According to the historian Lily Kay, “Up until around 1950 molecular biologists…described genetic mechanisms without ever using the term information” (Kay 2000, 328). “Information” replaced earlier talk of biological “specificity.” Watson and Crick's second paper of 1953 (1953b), which discussed the genetical implications of their recently discovered double-helical structure of DNA (1953a), announced: “…it therefore seems likely that the precise sequence of the bases is the code which carries the genetical information…” (Watson and Crick 1953b, 244, emphasis added).

In 1958, Francis Crick used and characterized the concept of information in the context of stating what he called the central dogma of molecular biology. Crick characterized the central dogma as follows:

This states that once ‘information’ has passed into protein it cannot get out again. In more detail, the transfer of information from nucleic acid to nucleic acid, or from nucleic acid to protein may be possible, but transfer from protein to protein, or from protein to nucleic acid is impossible. Information means here the precise determination of sequence, either of bases in the nucleic acid or of amino acid residues in the protein. (Crick 1958, 152–153, emphasis in original)

Note that, as characterized by Crick, information was not static in the way that, say, coded words on a page were static. Instead, Crick's characterization of information was dynamic; that is, it required a mechanism operating to carry out a task, i.e., “the precise determination of sequence” (Darden 2006a, 2006b). Crick also distinguished three different kinds of transfer or flow in the mechanism of protein synthesis: flow of information, flow of matter, and flow of energy.

As a molecular biologist, Crick explicitly focused his attention on flow of information, and not on flow of matter or energy. He discussed biochemical work dealing with matter and energy flow. Again we see one of the primary differences between molecular biology and biochemistry: molecular biology was concerned with genetic information and its role in protein synthesis. Crick emphasized that the nucleic acid sequences determined amino acid sequences and not vice versa. In 1958, it was still an open question how protein synthesis and nucleic acid synthesis operated. Crick's statement about the direction of information flow denied that amino acid sequence could determine the sequence of nucleic acid bases: the flow was one way, from genetic information to protein, but not back.

The central dogma did not go unchallenged. In 1970, an anonymous article in Nature, entitled “Central Dogma Reversed,” discussed the implications of the newly discovered enzyme, reverse transcriptase (Baltimore 1970; Temin and Mizutani 1970). In some viruses, whose genetic material was RNA, this enzyme was found to copy RNA into DNA, which was then inserted into the host genome. This reversal, the article claimed, challenged the “cardinal tenet of molecular biology” that “the flow or transcription of genetic information from DNA to messenger RNA and then its translation to protein is strictly one way” (Anonymous 1970, 1198). Crick (1970) responded that his statement of the central dogma had not been challenged by this finding. The principal problem to which the central dogma was addressed was the finding of “general rules for information transfer from one polymer [a long chain molecule] to another” (Crick 1970, 561). Crick pointed out that the denial of flow of information from proteins back to nucleic acid still held, even if the nucleic acid RNA could be complementarily copied to the nucleic acid DNA. A narrow scope mechanism schema, with a reverse arrow from RNA to DNA, was added to the more widely found DNA to RNA one, qualifying Watson's diagrammatic representation (discussed in Darden 1995).
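The difference between Crick's statement and Watson's diagram can be made explicit in a small sketch. It is only one possible reading of the passages quoted above: Crick's 1958 wording is encoded as ruling out every transfer whose source is protein, and Watson's diagram as permitting only its two arrows. The names `ruled_out_by_crick` and `WATSON_SCHEMA` are hypothetical and are introduced here purely for illustration.

```python
POLYMERS = {"DNA", "RNA", "protein"}

def ruled_out_by_crick(source, target):
    """One reading of Crick (1958): any transfer out of protein is impossible.

    Transfers from a nucleic acid are left open ("may be possible").
    """
    if source not in POLYMERS or target not in POLYMERS:
        raise ValueError("unknown polymer")
    return source == "protein"

# Watson's diagram, DNA -> RNA -> protein, read as a list of permitted arrows.
WATSON_SCHEMA = {("DNA", "RNA"), ("RNA", "protein")}

reverse_transcription = ("RNA", "DNA")
print(ruled_out_by_crick(*reverse_transcription))  # not excluded by Crick's statement
print(reverse_transcription in WATSON_SCHEMA)      # absent from Watson's diagram
```

On this reading, reverse transcription conflicts with the diagram but not with Crick's statement, which is the point of Crick's 1970 reply.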

In addition to its use in seminal papers, “information” can be found throughout textbooks in the field. Information is said to be “transferred” from DNA to RNA templates to proteins during protein synthesis (e.g., Watson 1965, 297). “Nucleic Acids Convey Genetic Information” is a chapter title in Watson et al.'s (1988, Ch. 3) textbook, Molecular Biology of the Gene. Again: “The translation of genetic information from the 4-letter alphabet of polynucleotides into the 20-letter alphabet of proteins is a complex process” (Alberts et al. 2002, 8).

It is important not to confuse the genetic code and genetic information. The genetic code refers to the relation between three bases of DNA, called a “codon,” and one amino acid. Tables available in molecular biology textbooks show the relation between 64 codons and 20 amino acids. For example, CAC codes for histidine. (For the table of the genetic code see, e.g., Watson et al. 1988, frontispiece, or in the entry on biological information.) Only a few exceptions for these coding relations have been found, in a few anomalous cases (see the list in a small table in Alberts et al. 2002, 814). In contrast, genetic information refers to the linear sequence of codons along the DNA, which (in the simplest case) are transcribed to messenger RNA, which are translated to linearly order the amino acids in a protein. Many exceptions to the colinearity hypothesis have been found (as discussed in Section 1.2 above), such as in split and overlapping genes.
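The distinction can be illustrated with a short sketch. It assumes the standard codon table, written with DNA letters and one-letter amino acid abbreviations (with `*` marking a stop codon), and it considers only the simplest colinear case with no splicing or overlapping genes. The names `GENETIC_CODE` and `translate` are hypothetical labels for this illustration: the table plays the role of the genetic code, while the argument passed to `translate` plays the role of genetic information.

```python
from itertools import product

BASES = "TCAG"
# Standard codon table in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# The genetic code: a fixed relation between each codon and one amino acid.
GENETIC_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(coding_sequence):
    """Read a sequence codon by codon until a stop codon or the end."""
    protein = []
    for i in range(0, len(coding_sequence) - 2, 3):
        amino_acid = GENETIC_CODE[coding_sequence[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(len(GENETIC_CODE))          # number of codons in the table
print(GENETIC_CODE["CAC"])        # H is the one-letter abbreviation for histidine
# Same code, different genetic information: the order of codons differs.
print(translate("ATGCACGGT"), translate("ATGGGTCAC"))
```

The table stays the same across both calls to `translate`; only the linear order of codons, and hence the order of amino acids, changes.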

The information concept (with its associated concepts of code, transcription, translation, reading frame, etc.) has indisputably played a major role in the history of molecular biology. The question for philosophers of biology is whether an analysis of the concept of information can capture this role. The usage of “information” in the mathematical theory of communication is too impoverished to capture the molecular biological usage (for critiques see Sarkar 1996b, 1996c; Sterelny and Griffiths 1999, 101–104), and the usage in cognitive neuroscience, with its talk of “representations” (e.g., Crick 1988, 154–155), may be said to be too rich. The coded sequences in the DNA are more than just a signal with some number of bits that may or may not be accurately transmitted, yet they are not said to have within them a representation of the structure of the protein. (The way in which the linear order of amino acids in a protein determines its three-dimensional structure is still an unsolved problem; however, even if these rules were known, it is doubtful that molecular biology would use the language of “representation.”) No definition of “information” as it is used in molecular biology has yet received wide support among philosophers of biology.

Stephen Downes distinguishes three positions on the relation between information and the natural world:

  1. Information is present in DNA and other nucleotide sequences. Other cellular mechanisms contain no information.
  2. Information is present in DNA, in other nucleotide sequences and other cellular mechanisms, for example cytoplasmic or extra-cellular proteins; and in many other media, for example, the embryonic environment or components of an organism's wider environment.
  3. DNA and other nucleotide sequences do not contain information, nor do any other cellular mechanisms. (Downes 2006)

These options may be read either ontologically or heuristically. A heuristic reading of (1), for instance, views the talk of information in molecular biology as useful in providing a way of talking and in guiding research (Downes 2006). And so the heuristic benefit of the information concept can be defended without making any commitment to the ontological status (Sarkar 2000). Indeed, one might argue that a vague and open-ended use of information is valuable for heuristic purposes, especially during early discovery phases in the development of a field (for a similar discussion of the gene-concept, see Section 2.3).

Philosophers' discussions of the concept of information in biology have not sought its heuristic usage in discovery contexts but have instead focused on its ontological reading. Three different philosophical accounts of information serve as exemplars of Downes' three categories. The goal is to see examples of differing positions on the issue of (a) whether DNA carries information from the perspective of its use in molecular biological mechanisms, such as DNA replication, transcription of messenger RNA, and protein synthesis, or, more broadly, (b) whether genes carry information for producing a phenotypic trait.

Take Downes' third category first. Kenneth Waters argues that information is a useful term in rhetorical contexts, such as seeking funding for DNA sequencing by claiming that DNA carries information. However, from an ontological perspective, Waters claims that explication of DNA's causal role has no need for the concept of information. Genes, he argues, should not be viewed as “immaterial units of information” (Waters 2000, 541). As discussed in Section 2.3 below, Waters' focus is on stretches of DNA whose causal roles are as actual specific difference makers in genetic mechanisms. On the unique causal role played by DNA sequences, as opposed to, say, different enzymes for synthesizing RNA, Waters says: “DNA is a specific difference maker in the sense that different changes in the sequence of nucleotides in DNA would change the linear sequence in RNA molecules in many different and very specific ways. RNA polymerase does not have this specificity. Intervening on RNA polymerase might slow down or stop synthesis of a broad class of RNA molecules, but it is not the case that many different kinds of interventions on RNA polymerase would change the linear sequence in RNA molecules in many different and very specific ways. This shows that DNA is a causally specific potential difference maker” (Waters 2007, Section 8). Talk of information is not needed; causal role function talk is sufficient. (For more on Waters' view see his entry on molecular genetics; for others who make similar points, see Sustar 2007; Weber 2005; 2006.)

Eva Jablonka (2002) is an example of Downes' second category. She argues that information is ubiquitous. She defines information as follows: a source becomes an informational input when an interpreting receiver can react to the form of the source (and variations in this form) in a functional manner. She claims a broad applicability of this definition. The definition, she says, accommodates information stemming from environmental cues as well as from evolved signals, and calls for a comparison between information-transmission in different types of inheritance systems: the genetic, the epigenetic, the behavioral, and the cultural-symbolic. Although her goal is to find a very general definition, the focus here will be on how well her definition applies to DNA as a source of information. She stresses the importance of organization in the source, and the order of bases is certainly crucial to the information carried by DNA. She also notes that variations in the source lead to variations in the form of the response. Her example is from the very broad perspective of molecular developmental biology: “variations in DNA lead to variations in development.” One may add that in limiting discussion to DNA replication and protein synthesis, the same is true: variations in DNA base sequences (may) produce variations in the products produced (ignoring degeneracy of the genetic code in which some different codons still code for the same amino acid, a point that Jablonka does not make). She stresses, as did Crick, that what flows is information, neither matter nor energy. However, Jablonka emphasizes the evolution of the “interpreting system of the receiver.” Presumably, for the molecular biology case, this is the machinery for translating messenger RNA into the linear order of amino acids in the protein, if the protein is to be considered a receiver.
Although the evolution of the “interpreting system” (including ribosomes and transfer RNAs) was required for information in the DNA to be read, that is not typically the focus for understanding information in DNA. Nor does the protein produced during the reading of the coded sequence seem to be appropriately called the “receiver” of the information. (The term “receiver” applies much better to her case of the ape that interprets the dark sky as information about an approaching storm, which raises questions about the evolution of such ability in the ape.) On this view, as she explicitly claims, genes have no theoretically privileged informational status (Jablonka 2002, 583).

Downes' first category applies specifically to the usage of information in molecular biology: information is present in DNA and other nucleotide sequences but not in other cellular mechanisms. With a bit of a stretch, Ulrich Stegmann (2005) provides an example with his analysis of template-directed synthesis. Stegmann does explicitly allow that components other than nucleotide sequences might contain what he calls instructional information. However, his only example is a thought experiment involving enzymes linearly ordered along a membrane; nothing of the sort is known to actually exist or even seems very likely to exist. Furthermore, his analysis, with this caveat, explicates in a precise way the concept of information that has played such an important role in molecular biology, namely Crick's (1958) “the precise determination of sequence, either of bases in the nucleic acid or of amino acid residues in the protein.” Stegmann calls this the sequentialization view. Stegmann's instructional account of genetic information requires that the component carrying the information satisfy the following conditions: an advance specification of the kind and order of steps that yield a certain outcome if the steps are carried out. On his account, DNA qualifies as an instructional information carrier for replication, transcription and translation. The sequence of bases provides the order. The hydrogen bonding between specific bases and the genetic code provide the specific kinds of steps. Stegmann clearly distinguishes sequentialization from coding. DNA carries information for the sequence of bases during DNA replication and during transcription of messenger RNA, namely for the order of the nucleic acid bases. The genetic code, in contrast, provides the kind of relation between a codon (three such bases) and a specific amino acid during the translation of mRNA during protein synthesis. 
The mechanisms of replication, transcription and translation yield certain outcomes: a copy of the DNA helix, an mRNA, and (in bacteria, with no splicing and editing of mRNA) a linear order of amino acids. The requirement of advance specification explicates the idea that DNA stores information that might not be in current use and aids in distinguishing those mechanisms with a flow of information from those with none, such as the Krebs cycle. Because DNA carries information about an outcome, his instructional account qualifies as an intentional (something is about something else) view. (Stegmann (2009) argues that Sarkar's (2005) semiotic account does not adequately account for such intentionality.) Also, because DNA carries information for a specific outcome, an error can occur as the mechanism operates to produce that outcome; hence Stegmann's account allows for errors and error-correcting mechanisms (such as proofreading mechanisms that correct DNA mutations).

Stegmann's instructional account of genetic information seems to be the best so far proposed to capture Crick's (1958) usage that has played a role in molecular biology's claim that DNA sequences carry genetic information. He explicitly notes that his analysis of genetic information applies only to DNA's sequentialization role, not to the issue of whether DNA carries information for phenotypic traits, which involve numerous other causal factors in the mechanisms that produce them. (For a similar view of the role of the informational framework in foregrounding sequence properties, see Godfrey-Smith 2007.)

Philosophical work continues, first, to find an adequate characterization of “information” as it is used in molecular biology; second, to distinguish mechanisms in which information is said to be transferred (such as DNA replication and protein synthesis) from those in which it is not (such as many metabolic reactions); and third, to answer the question of whether something appropriately called “information” is to be found in molecules and mechanisms.

For more on information, see the entry on biological information.

2.3 Gene

The question of whether classical genetics could be (or already has been) reduced to molecular biology (to be taken up below) motivated philosophers to consider the connectibility of the term they shared: the gene. Investigations of reduction and scientific change raised the question of how the concept of the gene evolved over time, figuring prominently in C. Kenneth Waters' (1990, 1994, 2007, see entry on molecular genetics), Philip Kitcher's (1982, 1984) and Raphael Falk's (1986) work. Over time, however, philosophical discussions of the gene concept took on a life of their own, as philosophers raised questions independent of the reduction debate: What is a gene? And, is there anything causally distinct about DNA?

Falk (1986) explicitly asked philosophers and historians of biology, “What is a Gene?” Falk drew on Kenneth MacCorquodale and Paul E. Meehl's distinction between quantities that can be obtained by manipulating values of empirical variables without hypothesizing the existence of unobserved entities or processes (dubbed “intervening variables”) and concepts which assert the existence of entities and the occurrence of events not reducible to the observable (dubbed “hypothetical constructs”) (MacCorquodale and Meehl 1948). Employing this distinction, Falk claimed that the gene began as an intervening variable but morphed into a hypothetical construct with Morgan's chromosomal theory of inheritance and then with molecular biology, when the gene became equated with a sequence of DNA.

Discoveries such as overlapping genes, split genes, and alternative splicing (discussed in Section 1.2) made it clear that simply equating a gene with an uninterrupted stretch of DNA would no longer capture the complicated molecular-developmental details of mechanisms such as gene expression (Downes 2004). In light of the enormous complexity found in the process of moving from a stretch of DNA to a protein product, Falk's (1986) question persists: What is a gene? Two general trends have emerged in the philosophical literature to answer this question and to accommodate the molecular-developmental phenomena: first, distinguish multiple gene concepts to capture the complex structural and functional features separately, or second, rethink a unified gene concept to incorporate such complexity.

A paradigmatic example of the first line came from Lenny Moss's distinction between Gene-P and Gene-D (Moss 2001, 2002). Gene-P embraced an instrumental preformationism (providing the “P”); it was defined by its relationship to a phenotype. In contrast, Gene-D referred to a developmental resource (providing the “D”); it was defined by its molecular sequence. An example will help to distinguish the two: Cystic fibrosis is one of the most common genetic diseases affecting populations of Western European descent. The disease results from an abnormality in cellular membrane proteins that function to transport chloride between cells and the extracellular fluid (for an overview of this research, see Collins 1992). Individuals receive two copies of the cystic fibrosis transmembrane conductance regulator (CFTR) gene, one from each parent. If an individual receives two mutated copies of this gene, then they will lack the resources necessary to transport chloride, and an imbalance in extracellular chloride will result, generating the tell-tale mucus that coats victims' cells, potentially generating deadly infections. When one talked about the gene for cystic fibrosis, the Gene-P concept was being utilized; the concept referred to the ability to track the transmission of this gene from generation to generation as an instrumental predictor of cystic fibrosis, without being contingent on knowing the causal pathway between the particular sequence of DNA and the ultimate phenotypic disease. The Gene-D concept, in contrast, referred instead to just one developmental resource (i.e., the molecular sequence) involved in the complex development of the disease, which interacted with a host of other such resources (proteins, RNA, a variety of enzymes, etc.); Gene-D was indeterminate with regard to the ultimate phenotypic disease.
Moreover, in cases of other diseases where there are different disease alleles at the same locus, a Gene-D perspective would treat these alleles as individual genes, while a Gene-P perspective treats them collectively as “the gene for” the disease. (For another example of a gene-concept divider, see Keller's distinction between the gene as a structural entity and the gene as a functional entity (Keller 2000, 70–72).)

A second philosophical approach for conceptualizing the gene involved rethinking a single, unified gene concept that captured the molecular-developmental complexities. For example, Eva Neumann-Held (Neumann-Held 1999, 2001; Griffiths and Neumann-Held 1999) claimed that a “process molecular gene concept” (PMG) embraced the complicated developmental intricacies. On her unified view, the term “gene” referred to “the recurring process that leads to the temporally and spatially regulated expression of a particular polypeptide product” (Neumann-Held 1999). Returning to the case of cystic fibrosis, a PMG for an individual without the disease referred to one of a variety of transmembrane ion-channel templates along with all the epigenetic factors involved in the generation of the normal polypeptide product. And so cystic fibrosis arose when a particular stretch of the DNA sequence was missing from this process. (For another example of a gene-concept unifier, see Falk's discussion of the gene as a DNA sequence that corresponded to a single norm of reaction for various molecular products based on varying epigenetic conditions (Falk 2001).)

Philosophers and historians of biology have not yet reached a consensus in answer to Falk's (1986) question: what is the gene? This fact has elicited a range of reactions. Rheinberger (2000) agreed that the gene concept was fuzzy but welcomed the imprecision; the gene was fruitful as an object of research in flux because the concept also remained operationally in flux (see Rheinberger and Mueller-Wille's entry on gene). Likewise, Weber (2005) has described the gene as having a “floating reference” that has allowed the concept to evolve over time. Paul Griffiths, meanwhile, in a review of the volume in which Rheinberger's essay appears, deemed the gene concept “Lost” but offered a “Reward to Finder” (Griffiths 2002; Beurton, Falk, and Rheinberger 2000). In fact, Griffiths and Karola Stotz are currently leading a philosophical search party of sorts. The “Representing Genes” project includes a group of philosophers and historians of biology who are attempting to operationalize some of the various philosophical claims about the gene concept discussed above and then test those claims (Griffiths and Stotz 2004, 2006; Stotz, Griffiths, and Knight 2004). The Representing Genes project can be monitored at its website: Representing Genes: Testing Competing Philosophical Analyses of the Gene Concept in Contemporary Molecular Biology.

Relatedly, philosophers have also debated the causal distinctiveness of DNA. Consider again the case of cystic fibrosis. A stretch of DNA on chromosome 7 is involved in the process of gene expression, which generates (or fails to generate) the functional product that transports chloride. But obviously that final product results from that stretch of DNA as well as all the other developmental resources involved in gene expression, be it in the expression of the functional protein or the dysfunctional one. Thus, a number of authors have argued for a causal parity thesis, wherein all developmental resources involved in the generation of a phenotype such as cystic fibrosis are treated as being on par (Griffiths and Knight 1998; Robert 2004); Stotz (2006), in particular, has pointed to the complications of post-genomics to make this point (see Section 1.4 on post-genomics).

Waters (2007, see also his entry on molecular genetics), in reply, has argued that there is something causally distinctive about DNA. Causes are often conceived of as being difference makers, in that a variable (i.e., an entity or activity in a mechanism) can be deemed causal when a change in the value of that variable would counterfactually have led to a different outcome (see the entry on scientific explanation). According to Waters, there are a number of potential difference makers in the mechanisms involved in developing or not developing cystic fibrosis; that is, an individual with two normal copies of the CFTR gene could still display signs of cystic fibrosis if a manipulation was done to the individual's RNA polymerase (the protein responsible for transcribing DNA to RNA), thereby undermining the functional reading of the stretch of DNA. So RNA polymerase is a difference maker in the development or lack of development of cystic fibrosis, but only a potential difference maker, since variation in RNA polymerase is not commonly identified as playing a role in the development or lack of development of cystic fibrosis in natural populations. The stretch of DNA on chromosome 7, however, is an actual difference maker. That is, there are actual differences in natural human populations on this stretch of DNA, which lead to actual differences in developing or not developing cystic fibrosis; the functional stretch of DNA is 230,000 base pairs long and generates a functional protein that is 1,480 amino acids long, but the most common mutation involves a deletion of three nucleotide bases in the stretch of DNA leading to a missing amino acid at the 508th position along the amino acid chain. DNA is causally distinctive, according to Waters, because it is an actual difference maker. Advocates of the parity thesis are thus challenged to identify the other resources (in addition to DNA) that are actual difference makers.

From the mechanistic perspective, Waters' concept of an actual difference maker points to a segment of DNA that actually plays a role in a gene expression mechanism that makes a difference to a phenotypic outcome. That DNA segment might be a coding region, making a difference in the coding of the information for determining the amino acid sequence in a protein, or it might be a regulatory sequence, making a difference in whether a coding region is copied to mRNA or not. The key is to understand how explanations of variation in outcomes are provided by molecular biology in the form of elucidated causal mechanisms that contain actual difference makers. Tabery (2009) refers to these as “difference mechanisms”.

3. Molecular Biology and General Philosophy of Science

In addition to analyzing key concepts in the field, philosophers have employed case studies from molecular biology to address more general issues in the philosophy of science. The issue of reduction was addressed by considering whether classical genetics had been reduced to molecular biology. Cases from molecular biology have also been used to analyze the relationship between laws and explanation (see also the entry on philosophy of biology). For each of these philosophical issues, it will be argued, evidence from molecular biology directs philosophical attention toward understanding the concept of a mechanism for addressing the topic.

3.1 Theory Reduction and Integration of Fields

Reflecting on the historical origins of molecular biology discussed above, it should come as no surprise that the field appeared to many philosophers of science to offer an ideal case of reduction. Molecular biology emerged out of the search for the structure and function of the gene, so might the older field of classical genetics be (or have been) simply reduced to a successor—molecular biology?

Classical genetics had two laws, which, at first, seemed likely candidates for reduction to (derivation from) molecular laws. Based on patterns of inheritance of characters during breeding experiments, classical geneticists inferred regularities in the behavior of genes. These regularities were captured in Mendel's laws of segregation and independent assortment of genes in different linkage groups (as described in Section 1.1 above). The formal reduction of classical genetics to molecular biology required that these classical laws be logically deduced from laws of molecular biology. However, it was not possible to identify anything in molecular biology that was called a “law” or that played a role sufficient to allow logical derivation of Mendel's laws. Alternative analyses of the relation between classical genetics and molecular biology have included claims of replacement, informal reduction, explanatory extension, and relations among practices, as well as analysis of the different mechanisms investigated in the two fields.

Kenneth Schaffner used and developed Ernest Nagel's (1961) analysis of derivational theory reduction to argue for the reduction of classical Mendelian genetics (T2) to molecular biology (T1), and he refined his account over many years (summarized in Schaffner 1993). The goal of formal reduction was to logically deduce the laws of classical genetics (or its improved successor, “modern transmission genetics” T2*) from the laws of molecular biology. Such a derivation required that all the terms of T2* not in T1 be connected to terms in T1 via correspondence rules. Hence, Schaffner endeavored to find molecular equivalents of such terms as “gene,” as well as predicate terms, such as “is dominant.”

David Hull (1974) criticized formal reduction, argued against Schaffner's claims, and suggested, instead, that perhaps molecular biology replaced classical genetics. Hull's critiques focused on the problem of connectibility of terms. One of his objections is called the “many-many” objection because of the many relations between a Mendelian term and a molecular term, and vice versa.

However, such connectibility of terms and logical derivation of the laws of one theory from those of another, required by formal reduction, were peripheral to the concerns of scientists, as both Schaffner (1974b) and Hull (1974) realized. The idealized formal reduction relation, even if it could have been imposed on some version of the historically developing fields (or some logical reconstruction of their theories), did not serve to capture the practice of scientists. (Van Regenmortel and Hull (2002) contains subsequent papers on this debate.)

Darden and Maull (1977) focused attention on the bridges between fields as an important locus of new discoveries in science. The bridges might be identities (required of correspondence rules in the formal reduction model), but they might also be other kinds of relations. Interfield relations included part-whole relations (e.g., genes are parts of chromosomes), structure-function relations (e.g., an identified molecule functions as the repressor in gene regulation), and cause and effect relations. Sometimes the relations were elaborated in an “interfield theory,” such as the chromosome theory of Mendelian heredity (genes as parts of chromosomes). What was important was to find the interfield relations, not to formally derive anything from anything.

Kitcher (1984; 1989; 1999) and Waters (1990) advanced the discussion about the relations between the fields. Kitcher criticized a reductive approach. Waters defended informal aspects of reduction. Utilizing an analysis of theory structure in terms of argument schemata (see Section 3.2), Kitcher argued that the relation between Mendelian and molecular genetics was “explanatory extension” (Kitcher 1984, 371). The theory of molecular genetics provided a refined and expanded set of premises when compared to the argument schemata of classical genetics (Kitcher 1989, 440–442). However, classical genetics retained its own schema. For example, the independent assortment of genes in different linkage groups (the amended version of Mendel's second law) was explained, according to Kitcher, by instantiating a pairing and separation schema, thereby showing that chromosomal pairing and separation was a unifying natural kind (Kitcher 1999). Such unification would be lost if attention were focused on the gory details at the molecular level. The cytological level thus constituted an “autonomous level of biological explanation” (Kitcher 1984, 371). On the other hand, to solve problems of gene replication, mutation, and action, the gory molecular details were required, and were part of the expanded premise set of the schema labeled “Watson-Crick” (Kitcher 1989, 441). Waters (1990) countered that the gory molecular details of chromosomal mechanisms were informative. Waters thus defended “informal reduction,” in which molecular models of crossing-over between homologous chromosomes were shown to be explanatory, even though no derivational reduction was involved.

In subsequent work, Waters (2004; 2008) argued that philosophical discussion of the relations between Mendelian genetics and molecular biology had been too focused on relations between theories. Instead of theory reduction, he argued that the relation should be analyzed, in part, as consisting of different “investigative strategies” that are currently in use. The “genetic approach,” which originated in classical genetics, requires recombining genetic mutations. After the discovery of DNA, genes could be used as “investigative levers” in functional analyses to understand how molecular parts contribute to the capacities of systems.

In a different way of changing the reduction debate, Alex Rosenberg (1997; 2006) argued for a shift away from relations between Mendelian genetics and molecular biology. Instead, he controversially divided biology into molecular biology and everything else, which he dubbed “functional biology.” In a manifesto in favor of molecular biology, he said: “Reductionism is the thesis that biological theories and explanations that employ them do need to be grounded in molecular biology and ultimately physical science, for it is only by doing so that they can be improved, corrected, strengthened, and made more accurate and more adequate and completed” (Rosenberg 2006, p. 4).

Hence, the task of explanatory reduction (in contrast to intertheoretic reduction), according to Rosenberg, is to explain all functional biological phenomena via molecular biology. He argued that embryology is being successfully reduced to molecular biology in contemporary molecular developmental biology. Idiosyncratically, he also argued that a next task is to find a way of reducing explanations based on the principle of natural selection. In his Darwinian Reductionism (2006), he argued that “how-possibly” functional (=adaptive) explanations need to be turned into “why-necessary” explanations via macromolecular details of the underlying genetic and biochemical pathways. (For a critical assessment, see Weber 2008.)

For example, suppose one wishes to explain why DNA has thymine and why messenger RNA, transfer RNA, and ribosomal RNA have uracil. Rosenberg argued that DNA was selected to solve the problem of high-fidelity information storage, while RNA was selected for low-cost information transmission and protein synthesis. Uracil is cheaper to synthesize than thymine, because thymine has a methyl group that uracil lacks. Hence, uracil, and with it the labile RNAs that contain it, can be synthesized quickly and efficiently. Thymine has other advantages in resisting mutations within the more stable DNA. The macromolecular details aid in explaining why these different molecules were selected for their functional roles: stability within DNA, on the one hand, and efficient, rapid synthesis of RNA, on the other.

Although knowledge about molecular structure can provide insight into the evolution of genetic mechanisms, the claim that molecular biology can supply “why-necessary” explanations is too strong. Some other molecules might have played these functional roles, but the particular properties of thymine and uracil aid in understanding why the mechanism actually operates as it does.

Critics and defenders of Rosenberg's view that developmental molecular biology is an example of successful reduction (explanation) of descriptive embryological generalizations discussed organizational and contextual features not captured by molecular biological principles. These included orientation of the embryo in the earth's gravitational field and other spatial, regulatory, and dynamical properties of developing systems (e.g., see Frost-Arnold 2004; Keller 1999; Laubichler and Wagner 2001; Love et al. 2008; Robert 2001, 2004). Delehanty (2005) countered the context objection in a more limited sense of reduction (not of general principles but in specific instances of “token-token” reduction) by introducing the concept of mechanism extension in which aspects of a mechanism's context are added to the description of the mechanism.

As early as 1976, Wimsatt (1976) argued for a shift in the reduction debate from talk of relations between theories to talk of decompositional explanation via mechanisms. Looking back at the earlier Schaffner/Hull debates from the perspective of the importance of mechanisms showed that their dispute hinged on debates about mechanisms of gene expression. Hull said:

Even if all gross phenotypic traits are translated into molecularly characterized traits, the relations between Mendelian and molecular predicate terms express prohibitively complex, many-many relations. Phenomena characterized by a single Mendelian predicate term can be produced by several different types of molecular mechanisms. Hence any reduction will be complex. Conversely, the same types of molecular mechanisms can produce phenomena that must be characterized by different Mendelian predicate terms. Hence, reduction is impossible. (Hull 1974, 39, emphasis added)

Schaffner criticized Hull for claiming that the same molecular mechanism could give rise to phenomena labeled with different Mendelian terms. “Different molecular mechanisms can appropriately be invoked in order to account for the same genetically characterized relation, as the genetics is less sensitive. The same molecular mechanisms can also be appealed to in order to account for different genetic relations, but only if there are further differences at the molecular level,” such as different initial conditions (Schaffner 1993, 444). Empirical investigation was required to determine the molecular mechanisms of gene expression (Schaffner 1993, 439). Furthermore, in subsequent work, Schaffner (2006) abandoned his earlier commitment to “intertheoretic reduction” in biology and instead advocated “explanatory reduction.” The latter, he claimed, was not sweeping but patchy and creeping. He argued that explanatory reductions consist of local decomposition-of-parts mechanistic explanations, which appeal to causal generalizations.

An alternative to the decompositional view of mechanisms analyzed hereditary mechanisms in terms of their entities and activities that bottom out at different size levels (Machamer, Darden, and Craver 2000). According to Darden (2005), the fields of classical genetics and molecular biology investigated serially connected mechanisms with different working entities (often at different size levels) that operate at different times in (what is now known to be) an integrated temporal series of hereditary mechanisms. Previous philosophical accounts of the relations between Mendelian genetics and molecular biology, she argued, missed the importance of these temporal relations among different mechanisms.

Classical geneticists discovered the “mechanism of Mendelian heredity” (Morgan et al. 1915). This mechanism, which produced the phenomena of gene segregation and independent assortment of different linkage groups, was discovered, not by decomposing genes into their parts, but by finding the wholes on which the parts were riding. Genes, Mendelian geneticists showed, were parts of chromosomes. The phenomena sketched in Mendel's laws of segregation and independent assortment were (and are) explained by the behavior of chromosomes because the chromosomes were found to be the working entities of the mechanisms of chromosomal pairing and separation during the formation of germ cells. Random assortment of the maternal and paternal chromosomes during pairing serves to independently assort different linkage groups (genes along the same chromosome). Pairing is followed by separation such that one of each pair goes into a separate germ cell, serving to segregate the paired alleles. Attention to the timing of these mechanisms showed that, curiously, the mechanism giving rise to the phenomena sketched in Mendel's second law of independent assortment occurs before the mechanism giving rise to the phenomena sketched in Mendel's first law of segregation.

Furthermore, Darden (2005) argued that molecular biologists discovered different mechanisms that operate before and after chromosomal pairing and separation. These mechanisms were found to have different working entities of different sizes, and required molecular techniques rather than Mendelian/cytological ones to find them. The discovery of the DNA double helix allowed the elucidation of the mechanism of gene reproduction and aided in understanding mechanisms of gene expression. Molecular biologists showed that DNA replication (with occasional errors producing mutations) was a first step in chromosomal duplication, which occurred prior to pairing and separation of the chromosomes. Gene expression, leading to the production of phenotypic characters, occurred after mating. Gene expression included the mechanism of protein synthesis, with the transcription of DNA sequence into messenger RNA and the translation of RNA into the amino acid sequence of proteins. Gene expression also included the many mechanisms of gene regulation operative during the expression of genes. All the mechanisms of gene expression operating during embryological development are still being sought. Mechanisms of DNA replication and protein synthesis showed remarkable unity in all living things, both in those with organized chromosomes and those without (such as bacteria), in sexually and asexually breeding organisms, and in plants and animals.

The working entities differed in the hereditary mechanisms discovered by classical genetics and molecular biology. The working entities of these temporally successive mechanisms were found to be the entire DNA double helix in DNA replication, the chromosomes during germ cell formation (with most genes packed away and inactive), and segments of DNA (genes) playing roles in the mechanisms of gene expression, such as protein synthesis and gene regulation.

Integration of a temporal series of hereditary mechanisms with different working entities, Darden (2005) argued, was the appropriate way to characterize the relations between classical genetics and molecular biology. This mechanistic analysis better captured, she claimed, the practice of biologists, with their frequent talk of mechanisms, than the analyses of the relations between the fields in terms of formal derivational theory reduction, informal reduction, replacement, related investigative strategies, and explanatory extension via expanded argument schemata. Progress in twentieth-century genetics occurred by discovering mechanisms with different working entities of different sizes and integrating these mechanisms into a temporal series of hereditary mechanisms.

In summary, philosophers of biology agree that the relations between Mendelian genetics and molecular biology are not appropriately analyzed via derivational intertheoretic reduction. The debate has moved beyond the question of whether the two fields are related via reduction or antireduction. Darden (2006a) argued that the relations are best viewed from the perspective of the relations among the different, but temporally related, hereditary mechanisms. Waters (2008) argued that the relations are best viewed from the perspective of different investigative strategies employed in the two fields. Combining Darden's and Waters' insights, the question is how different mechanisms discovered with different investigative strategies are related.

3.2 Explanation in Molecular Biology

Traditionally, philosophers of science took successful scientific explanations to result from derivation from laws of nature (see the entries on laws of nature and scientific explanation). On this deductive-nomological account (Hempel and Oppenheim 1948), an explanation of particular observation statements was analyzed as subsumption under universal (applying throughout the universe), general (exceptionless), necessary (not contingent) laws of nature plus the initial conditions of the particular case. Philosophers of biology have criticized this traditional analysis as inapplicable to biology, and especially molecular biology.

Since the 1960s, philosophers of biology have questioned the existence of biological laws of nature. J. J. C. Smart (1963) emphasized the earth-boundedness of the biological sciences (in conflict with the universality of natural laws). No purported “law” in biology has been found to be exceptionless, even for life on earth (in conflict with the generality of laws). And John Beatty (1995) argued that the purported “laws” of, for example, Mendelian genetics, were contingent on evolution (in conflict with the necessity of natural laws). (For further discussion, see Brandon 1997; Mitchell 1997; Sober 1997; Waters 1998.) Hence, philosophers' search for biological laws of nature, characterized as universal, necessary generalizations, has ceased.

Without traditional laws of nature from which to derive explanations, philosophers of biology have been forced to rethink the nature of scientific explanation in biology and, in particular, molecular biology. Two accounts of explanation emerged: the unificationist and the causal-mechanical. Philip Kitcher (1989, 1993) developed a unificationist account of explanation, and he and Sylvia Culp explicitly applied it to molecular biology (Culp and Kitcher 1989). Among the premises of the “Watson-Crick” argument schema were “transcription, post-transcriptional modification and translation for the alleles in question,” along with details of cell biology and embryology for the organisms in question (Kitcher 1989, 440–442). An explanation of a particular pattern of distribution of progeny phenotypes in a genetic cross resulted from instantiating the appropriate schema: the variables were filled with the details from the particular case and the conclusion derived from the premises. This unificationist tradition was thus a descendant of the deductive-nomological view in which explanation resulted from deduction of the conclusion from initial conditions and generalizations (general premises for Kitcher).

Working in the causal-mechanical tradition pioneered by Wesley Salmon (1984, 1998), other philosophers turned to understanding mechanism elucidation as the avenue to scientific explanation in biology (Bechtel and Abrahamsen 2005; Bechtel and Richardson 1993; Craver 2007; Darden 2006a; Glennan 2002; Machamer, Darden, and Craver 2000; Sarkar 1998; Schaffner 1993; Woodward 2002). There are differences among the various accounts of a mechanism. Some philosophers have emphasized the decomposition of a complex system into interacting parts (Bechtel and Richardson 1993; Glennan 2002; Sarkar 1998), while others have not; some philosophers have retained the importance of subsumption under general rules (Sarkar 1998; Schaffner 1993), while others have not; and different philosophers have characterized in different manners the way in which the parts of a mechanism behave: as a function (Bechtel and Abrahamsen 2005), an activity (Machamer, Darden, and Craver 2000), an interaction (Glennan 2002; Woodward 2002), or an interactivity (Tabery 2004). But they hold in common the basic idea that a scientist provides a successful explanation by identifying and manipulating variables in a regular causal mechanism, thereby determining how those variables are situated in and make a difference in the mechanism; the ultimate explanation amounts to the elucidation of how those variables act and interact to produce the phenomenon under investigation. As mentioned above (see Section 2.1), an elucidated mechanism allows for the explanatory features of understanding, prediction, and control.

There are several virtues of the causal-mechanical approach to understanding scientific explanation in molecular biology. For one, it is truest to molecular biologists' own language when engaging in biological explanation. Molecular biologists rarely describe their practice and achievements as the development of new theories; rather, they describe their practice and achievements as the elucidation of molecular mechanisms (Craver 2001; Machamer, Darden, and Craver 2000). In a 1980 review of the achievements of molecular biology, Bernard Davis highlighted “the extraordinary unity in the molecular mechanisms underlying the rich diversity of biology” (Davis 1980, 78). Michel Morange characterized the heart of molecular biology as consisting in the understanding of the “mechanisms of information exchange within cells” (Morange 1998, 176). And Ahmad Hariri and Andrew Holmes write of the molecular genetic causes of psychiatric disorders, “Identifying biological mechanisms through which genes lead to individual differences in emotional behavior is paramount to our understanding of how such differences confer risk for neuropsychiatric illness” (Hariri and Holmes 2006, 182).

Another virtue of the causal-mechanical approach is that it captures biological explanations of both regularity and variation. Unlike in physics, where a scientist assumes that an electron is an electron is an electron, a biologist (as the Hariri and Holmes quotation just above conveys) is often interested in precisely what makes one individual different from another, one population different from another, or one species different from another. Philosophers have extended the causal-mechanical account of explanation to cover biological explanations of variation, be it across evolutionary time (Calcott 2009) or across individuals in a population (Tabery 2009). Tabery (2009) characterized biological explanations of variation across individuals in a population as the elucidation of “difference mechanisms”. Difference mechanisms are regular causal mechanisms made up of difference-making variables, one or more of which are actual difference makers (see Section 2.3 for the discussion of Waters' (2007) concept of an actual difference maker). There is regularity in difference mechanisms; interventions made on variables in the mechanisms that change the values of the variables lead to different outcomes in the phenomena under investigation. There is also variation in difference mechanisms; interventions need not be made to find differences in outcomes because, with difference mechanisms, some variables are actual difference makers that already take different values in the natural world, resulting in natural variation in the outcomes.

In addition to these virtues, the causal-mechanical approach also captures the consolidation of explanations across biology emphasized by the unificationist approach. In contrast to Kitcher's unificationist approach to explanation via instantiation of argument schemata, on the causal-mechanical approach, unification occurs via instantiation of the same abstract mechanism schema within a domain. Some mechanism schemas have a domain of wide scope, such as the schema: DNA → RNA → protein. In contrast, another schema has a very narrow scope of applicability, namely the schema for retroviruses: RNA → DNA. One unifies to the extent that empirical investigation shows the same mechanism schema can appropriately be instantiated within a given domain, of whatever scope. Unification and explanation are two separate issues. One explains a phenomenon by elucidating the mechanism that produces it.

4. Conclusion

An overview of the history of molecular biology revealed the original convergence of geneticists, physicists, and structural chemists on a common problem: the structure and function of the gene. Conceptual and methodological frameworks from each of these disciplinary strands united in the ultimate determination of the double helical structure of DNA (conceived of as an informational molecule) along with the mechanisms of gene replication, mutation, and expression. With this recent history in mind, philosophers of molecular biology have examined the key concepts of the field: mechanism, information, and gene. Moreover, molecular biology has provided cases for addressing more general issues in the philosophy of science, such as reduction and the integration of fields, and the nature of explanation without laws of nature in biology. It has been argued that, given the importance of the discovery of macromolecular mechanisms throughout the history of molecular biology, a philosophical focus on mechanisms generates the clearest picture of its history, of its concepts, and of the cases from its past utilized by philosophers of science.

Related Entries

biology: philosophy of | causation: and manipulability | chemistry, philosophy of | developmental biology | gene | genetics: and genomics | genetics: molecular | heritability | human genome project | information: biological | laws of nature | life | reduction, scientific: in biology | scientific explanation | self: the biological notion of