Some of the oldest words in English have been identified, scientists say. Reading University researchers claim "I", "we", "two" and "three" are among the most ancient, dating back tens of thousands of years.
Their computer model analyses the rate of change of words in English and the languages that share a common heritage. The team says it can predict which words are likely to become extinct - citing "squeeze", "guts", "stick" and "bad" as probable first casualties.
"We use a computer to fit a range of models that tell us how rapidly these words evolve," said Mark Pagel, an evolutionary biologist at the University of Reading. "We fit a wide range, so there's a lot of computation involved; and that range then brackets what the true answer is and we can estimate the rates at which these things are replaced through time."
Sound and concept
Across the Indo-European languages - which include most of the languages spoken from Europe to the Asian subcontinent - the vocal sound made to express a given concept can be similar. New words for a concept can arise in a given language, utilising different sounds, in turn giving a clue to a word's relative age in the language.
At the root of the Reading University effort is a lexicon of 200 words that is not specific to culture or technology, and is therefore likely to represent concepts that have not changed across nations or millennia.
"We have lists of words that linguists have produced for us that tell us if two words in related languages actually derive from a common ancestral word," said Professor Pagel. "We have descriptions of the ways we think words change and their ability to change into other words, and those descriptions can be turned into a mathematical language," he added.
The researchers used the university's IBM supercomputer to track the known relations between words, in order to develop estimates of how long ago a given ancestral word diverged in two different languages. They have integrated that into an algorithm that will produce a list of words relevant to a given date.
"You type in a date in the past or in the future and it will give you a list of words that would have changed going back in time or will change going into the future," Professor Pagel told BBC News. "From that list you can derive a phrasebook of words you could use if you tried to show up and talk to, for example, William the Conqueror."
That is, the model provides a list of words that are unlikely to have changed from their common ancestral root by the time of William the Conqueror. Words that have not diverged since then would sound similar to their modern descendants, and their meanings would therefore probably be recognisable on sound alone. However, the model cannot offer a guess as to what the ancestral words were. It can only estimate the likelihood that the sound of a modern English word might make some sense if called out during the Battle of Hastings.
Dirty business
What the researchers found was that the frequency with which a word is used relates to how slowly it changes through time, so that the most common words tend to be the oldest ones. For example, the words "I" and "who" are among the oldest, along with the words "two", "three", and "five". The word "one" is only slightly younger.
The word "four" experienced a linguistic evolutionary leap that makes it significantly younger in English and different from other Indo-European languages. Meanwhile, the fastest-changing words are projected to die out and be replaced by other words much sooner.
For example, "dirty" is a rapidly changing word; currently there are 46 different ways of saying it in the Indo-European languages, all words that are unrelated to each other. As a result, it is likely to die out soon in English, along with "stick" and "guts". Verbs also tend to change quite quickly, so "push", "turn", "wipe" and "stab" appear to be heading for the lexicographer's chopping block.
Again, the model cannot predict what words may change to; those linguistic changes are, according to Professor Pagel, "anybody's guess".
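
The idea of typing in a date and getting back a list of words can be illustrated with a toy calculation. The Python sketch below is not the researchers' model: it simply assumes each word is replaced at a constant rate, so the chance that it survives unchanged falls off exponentially with time. The rates, the `ILLUSTRATIVE_RATES` table and the function names are all hypothetical placeholders, not figures or code from the study.

```python
import math

# Hypothetical replacement rates (expected replacements per 1,000 years).
# These numbers are placeholders for illustration, NOT results from the study.
ILLUSTRATIVE_RATES = {
    "two": 0.02,
    "I": 0.03,
    "who": 0.05,
    "push": 0.60,
    "dirty": 0.90,
}


def retention_probability(rate_per_kyr, years):
    """Chance a word has not been replaced over `years`, assuming a constant rate.

    Negative `years` (looking into the past) is treated the same as positive.
    """
    return math.exp(-rate_per_kyr * abs(years) / 1000.0)


def half_life_years(rate_per_kyr):
    """Years after which a word has a 50% chance of having been replaced."""
    return 1000.0 * math.log(2) / rate_per_kyr


def phrasebook(rates, years, threshold=0.5):
    """Words whose retention probability over `years` is at least `threshold`."""
    return sorted(
        word
        for word, rate in rates.items()
        if retention_probability(rate, years) >= threshold
    )
```

With these placeholder rates, `phrasebook(ILLUSTRATIVE_RATES, 950)` keeps "I", "two", "who" and "push" but drops "dirty", mirroring the article's point that slowly changing words are the ones a time-traveller could still rely on. Under the same assumption, a lower rate means a longer half-life, which is one simple way to read the finding that frequently used words tend to be the oldest.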
High fidelity
"We think some of these words are as ancient as 40,000 years old. The sound used to make those words would have been used by all speakers of the Indo-European languages throughout history," Professor Pagel said. "Here's a sound that has been connected to a meaning - and it's a mostly arbitrary connection - yet that sound has persisted for those tens of thousands of years."
The work casts an interesting light on the connection between concepts and language in the human brain, and provides an insight into the evolution of a dynamic set of words. "If you've ever played 'Chinese whispers', what comes out the end is usually gibberish, and more or less when we speak to each other we're playing this massive game of Chinese whispers. Yet our language can somehow retain its fidelity."