Dawkins the weasel, and a program

November 12, 2009

Darwinian evolutionists have been quick to point to Dawkins' Weasel (a.k.a. cumulative selection) whenever an Intelligent Design proponent states that evolution is nothing but a random walk. Here we explore their precious algorithm, which is supposedly the foundation for designing (if we could call it that) all the irreducibly complex, specified information-processing systems we find in biology, and hence for creating the illusion of design or of an intelligent designer at work.

We will go through a simple program that demonstrates cumulative selection on literal strings. A string x mutates at each iteration at a constant, user-supplied mutation rate, and a fitness function tests the current state x against the next, mutated state y. The program doesn’t explicitly say what it is modeling. According to Dawkins the example is meant to demonstrate the power of cumulative selection, but really the literal string being tested could be anything. It could represent a single offspring relative to another, or a population of offspring relative to another. From this point let’s take a few steps back: does it also apply to molecular machinery such as the bacterial flagellum (or, put another way, is it universal)? Does the algorithm only apply top-down, where a significant amount of functional information must be present before a selection criterion can emerge, or does it apply bottom-up, in which case it actually selects new usable function through the Darwinian process of survival of the fittest relative to the niche (the fitness function), with random mutations running in parallel in the genetic medium?

As a programmer, I needed to see for myself what all the fuss is about, so within about an hour I had one rolling (apart from the minor add-ons). Not everyone knows how to interpret code (in this case C code), so my task is to explain to the viewer or layman exactly what’s going on behind the scenes, what it proves (if anything) and how it depicts reality.


#include<stdio.h>
#include<ctype.h>
#include<stdlib.h>
#include<string.h>   /* strlen, strcpy, strcmp */
#include<time.h>     /* time, used to seed rand */
#include<math.h>

#define ALPHABET "ABCDEFGHIJKLMNOPQRSTUVWXYZ "
#define ALPHA_LENGTH 27

typedef enum { FALSE = 0, TRUE = 1 } boolean;

void upString(char *);
void mutate(char *, char *, int);
void shuffle(char *);
int randInt(int, int);
int fitness(char *);
boolean isLocked(char *);

char TARGET_STRING[100];
int TARGET_LENGTH;

int main(int argc, char **argv) {

 float rate = 0.0;
 FILE* file = (argv[1] == NULL) ? fopen("output.txt", "w") : fopen(argv[1], "w");

 if(file == NULL) {
    return 1;
 }

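 fprintf(stdout,"\nEnter a target string:\t");
 /* gets() cannot bound its input; fgets() stops at the size of the buffer */
 if(fgets(TARGET_STRING, sizeof TARGET_STRING, stdin) == NULL) {
    fclose(file);
    return 1;
 }
 TARGET_STRING[strcspn(TARGET_STRING, "\n")] = '\0';   /* drop the trailing newline */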

 fprintf(stdout,"\nEnter a mutation rate:\t");fscanf(stdin,"%f", &rate);

 TARGET_LENGTH = strlen(&TARGET_STRING[0]);

 float mutation_rate = ceil((rate / 100) * TARGET_LENGTH);int m_rate = (int)mutation_rate;

 upString(&TARGET_STRING[0]);

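 /* each array needs one extra byte for the terminating '\0' */
 char *current_ptr;
 char current[TARGET_LENGTH + 1];

 current_ptr = &current[0];

 strcpy(&current_ptr[0], &TARGET_STRING[0]);

 shuffle(&current_ptr[0]);

 char *buffer_ptr;
 char buffer[TARGET_LENGTH + 1];

 buffer_ptr = &buffer[0];

 /* give the buffer a defined value before the first isLocked() test */
 strcpy(&buffer_ptr[0], &current_ptr[0]);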

 int gen_count = 0;    char str_gc[100];
 int i;

 while(!isLocked(buffer_ptr)) {

 if (system("cls")) system("clear");

 mutate(current_ptr, buffer_ptr, m_rate);

 if(fitness(&buffer_ptr[0]) >= fitness(&current_ptr[0])) {
    strcpy(&current_ptr[0], &buffer_ptr[0]);
 }

 puts(current_ptr);
 fputs(current_ptr, file);
 fputs("\n", file);
 gen_count++;
 }

 fprintf(stdout,"\nNumber of trials to reach target: %d", gen_count);

 fputs("\n", file);
 /* itoa() is not standard C; fprintf() formats the count portably */
 fprintf(file, "\nNumber of trials to reach target: %d", gen_count);

 fclose(file);

return 0;
}

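void upString(char *string) {
 int i;
 /* the cast keeps toupper() defined for characters with negative char values */
 for(i=0;i<TARGET_LENGTH; i++)
    string[i] = (char)toupper((unsigned char)string[i]);
}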

void mutate(char* cPtr, char* bPtr, int m_rate) {
 strcpy(&bPtr[0], &cPtr[0]);
 int i;
 for(i=0;i<m_rate;i++) {
    bPtr[randInt(0, (TARGET_LENGTH - 1))] = ALPHABET[randInt(0,(ALPHA_LENGTH - 1))];
 }
}

void shuffle(char *string) {
 int j, k;
 for(j=0; j<TARGET_LENGTH; j++) {
    for(k=0; k<TARGET_LENGTH; k++) {
       /* swap position k with a randomly chosen position r */
       int r = randInt(0, TARGET_LENGTH - 1);
       int temp = string[k];

       string[k] = string[r];
       string[r] = temp;
    }
 }
}

int fitness(char *cString) {
 int i;
 int count = 0;
 /* count the positions where the candidate already matches the target */
 for(i=0;i<TARGET_LENGTH;i++) {
    if(*(cString + i) == TARGET_STRING[i]) {
       count++;
    }
 }
 return count;
}

int randInt(int min, int max) {
 static int kState = 0;int i;
 if(kState == 0) {
   srand(time(NULL));
    kState = 1;
 }
 i = (rand() % (max - min + 1) + min);

return i;
}

boolean isLocked(char *curPtr) {
 boolean state = FALSE;
 if(strcmp(&curPtr[0], &TARGET_STRING[0]) == 0) {
   state = TRUE;
 }
return state;
}

Example output on string METHINKS DAWKINS IS A WEASEL

T SEAASMENQKAIWLSW IIS HDNE
T SEAASMENQKAIWLSW  IS HDNE
T SEAASMENQKJIWLSW  IS HDNE
K SEAASMENQKJIWLSW  IS HDNE
S SEAASMENQKJIWLSW  IS HDNE
S SEAASMENQKJIWLSW IIS HDNE
S SEAASMENQKJIWLSW IISRHDNE
G SEAASMENQKJIWLSW IISRHDNE
………
METHINKS DASKINS IS A WEASEL
METHINKS DAJKINS IS A WEASEL
METHINKS DAJKINS IS A WEASEL
METHINKS DAJKINS IS A WEASEL
METHINKS DAWKINS IS A WEASEL

Number of trials to reach target: 3703

To try the program yourself, first compile it with the gcc compiler (on Linux you may also need to link the math library with -lm, since the program calls ceil). Once you have the executable, type its name on the command line followed by the name of the text file you want the output written to.

program_exe output_file.txt

If you do not give a file name for the output, the program will create one for you named output.txt, located in the directory you ran the program from.

Before Dawkins publicly presented the algorithm, he made it explicit that it was aimed at the notion that evolution is completely random (apparently at the time this was the common “public” understanding, or misunderstanding, according to Dawkins and evolutionary biologists in general). What Dawkins meant was that for Darwinian evolution to occur it must have two necessary components in addition to survival of the fittest and random mutations: a reference and a buffer.

The reference in this case is the environment/niche itself. A set of x living within a y holds a reference to that y, similarly to how a house or company equipped with a modem holds a reference to a public IP address. The company has a scope relative to the internet; similarly, an organism has a scope relative to the environment. It’s a virtual reference that the population is either kept in sync with or not. Keeping in sync with the surrounding environment and the competition within it is what will either propagate the species or diminish it. This gets us to survival of the fittest. The environment indirectly specifies the target, acting as an adaptation function.

Survival of the fittest is nothing more than one “species” having the toolset to either win or lose, survive or not. If you watch nature documentary films, you will find direct competition between prey and predators, but what you won’t find as easily is the indirect competition which involves an organism being more or less suited to the environment as a whole. This includes but is not limited to the given resources, climate, etc…

Random mutations are, of course, inherited changes to the genetic information that arise through combination or aggregation during reproduction and accumulate over time through genetic drift, which is where the real results of evolution (supposedly) start to appear. Reproductive success is the quantitative measure of an organism’s ability to reproduce.

The buffer in this case is the organism itself, where modifications can be stored and re-transmitted to subsequent offspring for further combination and buffering. When we discuss changes in Darwinian evolutionary terms, we discuss this transmit -> store -> transmit -> store… cycle. We call it a buffer because a buffer is a temporary storage location: the organism lives and dies, and when a new organism takes its place the existing buffer is overwritten (so to speak, and with the added bonus of random mutations: neutral, harmful and maybe even beneficial, in the sense that new working function is induced). So to propagate change we have a storage mechanism, something that holds the next state of change, where one state, given some input, changes to a new state.

As plausible as the mechanism may sound, there are significant problems with all of this, and they require careful attention to detail. If you look at the transmit -> store cycle, we start with a transmit before anything else. This indicates that we start with some important pre-existing potential rather than none at all. What we find is that the algorithm requires a significant degree of functional information before selection can even take place; after all, natural selection can only work if there is something to select from. The functional information required would be similar to booting a minimal PC and OS (this includes RAM/temporary storage, permanent storage such as flash memory, a CPU, basically the whole von Neumann fetch-decode-execute architecture and a whole lot of low-level software governing its operations) and getting that to evolve. But in fact this is self-replication, and it is an incredibly difficult task even for an intelligent designer. When we speak of self-replication we mean having a copy of its own instructions plus the necessary assembly line to put them together.

Another problem is the target. This is yet another added bonus injected by the programmer in advance, and Dawkins admits it. Dawkins admits that the program does not model reality correctly, since in reality there is no target. There is only the environment, survival of the fittest and random mutations; the target in this case is simply to survive. If x survives, x has reached a temporary target y that it is best suited and well adapted for, and the process continues: mutate -> transmit -> store -> mutate -> transmit -> store… I added the mutate step here to make it clearer: mutate requires information to act on, and that information doesn’t pop into existence. This is significant because it applies to every single program intended to simulate Darwinian evolution. Stuff just doesn’t pop out of thin air; you, the clever programmer, have done your share of planning. Whether it has a target or not, you know exactly what the program is doing and what any results (temporary without a target, or fixed given the target) represent. Without at least that knowledge of what the results represent, the algorithm collapses. The computer has no clue what the algorithm is doing; it just fetches and executes instructions. It does what you “tell” it to do. So, given that, what does the environmental reference point have to do with producing new function and keeping track of it? What it comes down to is this: is it enough to say that, given random mutations and natural selection, these two primary forces will inevitably create and sustain all the functional specified complexity in biology? Don’t count on it!

continued later…