Category Archives: Linguistics

Turing tests in Chinese rooms: What does it mean for AI to outperform humans


TLDR;

  • Reports that AI beat humans on certain benchmarks or very specialised tasks don’t mean that AI is actually better at those tasks than any individual human.
  • They certainly don’t mean that AI is approaching the task with any of the same understanding of the world people do.
  • People actually perform 100% on the tasks when administered individually under ideal conditions (no distraction, typical cognitive development, enough time, etc.). They will start making errors only if we give them too many tasks in too short a time.
  • This means that just adding more of these results will NOT cumulatively approach general human cognition.
  • But it may mean that AI can replace people on certain tasks that were previously mistakenly thought to require general human intelligence.
  • All tests of artificial intelligence suffer from Goodhart’s law.
  • A test more closely resembling an internship or an apprenticeship than a gameshow may be a more effective version of the Imitation Game.
  • Worries about ‘superintelligence’ are very likely to be irrelevant because they are based on an unproven notion of arbitrary scalability of intelligence and ignore limits on computability.

Reports of my intelligence have been greatly exaggerated

Over the last few years, there have been various pronouncements about AI being better than humans at various tasks such as image recognition, speech transcription, or even translation. And that’s not even taking into account bogus winners of the Turing test challenge. To make things worse, there’s always the implication that this means machine learning is getting closer to human learning and artificial intelligence is only a step away from going general.

All of those reports were false. Every. Single. One. How do we know this? Well, because none of them were followed by “and therefore we have decided to replace all humans doing job X with machine learning algorithms”. But even if this were the case, it still would not necessarily mean that the algorithm outperforms humans at the task. Just that it can outperform them at the task when it is repeated time after time and the algorithm ends up making fewer mistakes because, unlike people, it does not get tired, distracted, or simply tick the wrong box.

But even if the aggregate number of errors is lower for a machine learning algorithm, it may still not make sense to use it because it makes qualitatively different errors. Errors that are more random and unpredictable are worse than more systematic errors that can be corrected for. The problem is compounded because AI has no metacognitive mechanisms to identify its errors by doing a ‘sense check’. This often makes AI-generated transcripts difficult to correct because the errors don’t make intuitive sense.

Pattern matching in radiology and law

The closest machine learning has gotten to outperforming humans doing real jobs is in radiology. (I’m discounting games like Go, here.) But even here it only equalled the performance of the best experts. However, this could easily be enough. But interpreting X-rays is an extremely specialised task that requires lots of training and has a built-in error rate. It is a pattern recognition exercise, not a general reasoning exercise. All the general reasoning about the results of the X-rays still has to be delegated to the human physician.

In a similar instance, AI could notice inconsistencies in complex contracts better than lawyers. Again, this is very plausible, but again this was a pattern-matching exercise with a machine pitted against human distractibility and stamina. Definitely impressive, useful, and not something expected even a few years ago. But not in any meaningful way replacing the lawyer any more than a form to draw up a contract I downloaded from the internet does.

This is definitely a case where an AI can significantly augment what an unassisted human can do. And while it will not replace radiologists or lawyers as a category, it could certainly greatly decrease their numbers.

Machine learning to the test

So on very specialised tasks involving complex pattern recognition, we could say that AI can genuinely outperform humans.

But in all the instances involving language and reasoning tasks, even if an AI beats humans on a test, it does not actually ‘outperform’ them on the task. That’s because tests are always imperfect proxies for the competence they measure.

For example, native speakers often don’t get 100% on English proficiency tests and can even do worse than non-native speakers in certain contexts. Why? Two reasons: 1. They can imagine contexts not expected of non-native speakers. 2. The non-native speakers have been practicing taking these tests a lot so they make fewer formal mistakes.

We are facing exactly the same problems when comparing machine learning and human performance based on tests designed to evaluate machine learning. Humans are the native speakers and they perform 100% on all the tasks in their daily lives. But their performance seems less than perfect in test conditions.

BLEU and overblown claims about Machine Translation

Sometimes the problem is with a poorly designed test. This is the case with the common measure of machine translation called BLEU (Bilingual Evaluation Understudy). BLEU essentially measures how many similar words or word pairs there are in the translation by machine when compared to a reference corpus of human translations. It is obvious that this is not a good metric of quality of translation. It can easily assign a lower score to a good translation and a high score to a patently bad one. For instance, it would not notice that the translation missed a ‘not’ and gave the opposite meaning.
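To make the ‘not’ problem concrete, here is a minimal sketch of the word and word-pair overlap that sits at the core of BLEU. It is not the full metric (real BLEU combines several n-gram lengths and adds a brevity penalty), and the function name and the three example sentences are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all consecutive n-token sequences in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def overlap_precision(candidate, reference, n):
    """Share of the candidate's n-grams that also occur in the reference.

    Counts are clipped, so a word repeated in the candidate is only
    credited as often as it appears in the reference.
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Hypothetical sentences, not taken from any real test set.
reference = "the committee did not approve the budget"
missing_not = "the committee did approve the budget"   # opposite meaning
paraphrase = "the panel rejected the spending plan"    # same meaning

for name, sentence in [("missing 'not'", missing_not), ("paraphrase", paraphrase)]:
    words = overlap_precision(sentence, reference, 1)
    pairs = overlap_precision(sentence, reference, 2)
    print(f"{name}: word overlap {words:.2f}, word-pair overlap {pairs:.2f}")
```

On these toy sentences, the version that reverses the meaning shares every word and most word pairs with the reference, while the faithful paraphrase shares almost nothing.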

What human translators do is translate whole texts NOT sentences. This sometimes means they drop things, add things, rearrange things. This involves a lot of judgment and therefore no two translations are ever the same. And outside trivial cases they’re never perfect. But a reliable translator can make sure they convey the key message and they could provide footnotes to explain where this was not possible. Machine learning can get surprisingly good at translating texts by brute force. But it is NOT reliable because it operates with no underlying understanding of the overall meaning of the text.

That’s why we can easily dismiss Microsoft’s claim that their English-to-Chinese interpreter outperformed human translators. That is only because they used the BLEU metric to make this claim rather than professional translators evaluating the quality of AI output against that of other professional translators on any test. And since Microsoft has yet to announce that it is no longer using human interpreters when its executives visit China, we can safely assume that this ‘outperform’ is not real.

Now, could a machine translation ever get good enough to replace human translators? Possibly. But it is still very far from that for texts of any complexity. Transformers are very promising at improving the quality of the translation but they still only match patterns. To translate you need to make quite rich inferences and we’re nowhere near this.

GLUE and machine understanding come unstuck

Speaking of inferences. How good is AI at making those? Awful. Here we have another metric to look at: GLUE! Unlike BLEU which is a really bad representation of the quality of translation, GLUE (General Language Understanding Evaluation) is a really good representation of human intelligence. If you wanted to know what are the components of human intelligence, you could do a lot worse than look at the GLUE test.

But the GLUE leaderboard has a human benchmark and it comes 4th with 87.1% score. This puts it 1.4 percentage points behind the leader which is Facebook at 88.5%. So, it’s done. AI has not only reached human level of reasoning, it has surpassed it! Of course not. Apart from the fact that we don’t know how much of a difference in reasoning ability 1% is, this tells us nothing about human ability to reason when compared to that of a machine learning model. Here’s why.

How people and machines make errors

I would argue that a successful machine learning algorithm does not actually outperform humans on these tasks even if it got 100%. Because humans also get 100% but they also devised the test.

Isn’t this a contradiction? How can humans get 100% if they consistently score in the mid-80s when given the test? Well, humans designed the test and the correctness criteria. And a machine learning algorithm must match the best human on every single answer to equal them. The benchmark here is just an average of many people over many answers and does not just reflect the human ability to reason but also the human ability to take tests.

Let’s explain by comparing what it means when a human makes an error on a test and when a machine does. There are three sources of human error: 1. Erroneous choice when knowing the right answer (i.e. clicking a when meaning to click b), 2. Lack of attention (i.e. choosing a because we didn’t spend enough time reading the task to choose correctly), 3. Overinterpretation (providing context in our head that makes the incorrect answer make sense).

These benchmarks are not Mensa tests, they measure what all people with typical linguistic and cognitive development can do. Let’s take the Winograd Schema test as an example. Here’s an often-quoted example:

The trophy didn’t fit into the suitcase because it was too big.
The trophy didn’t fit into the suitcase because it was too small.

It is very possible that out of 100 people, 5 would get this wrong because they click the wrong answer, 10 because they didn’t process the sentence structure correctly and 1 because they constructed a scenario in their head in which it is normal for suitcases to be smaller than the thing in them (as in Terry Pratchett’s books).

But not a single one got it wrong because they thought that a thing can be bigger than the thing it fits in.
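The arithmetic behind this can be spelled out in a few lines. The counts are the hypothetical ones from the paragraph above, not measured data, and the category names are labels I made up for the three sources of error.

```python
# Hypothetical breakdown of 100 test takers on one Winograd item,
# using the illustrative numbers from the text (not measured data).
respondents = 100
errors = {
    "misclick": 5,             # knew the answer, clicked the wrong one
    "inattentive_parse": 10,   # did not process the sentence structure
    "overinterpretation": 1,   # imagined a context that licenses the other answer
    "competence": 0,           # believed a thing can be bigger than its container
}

# What a leaderboard would report as the 'human benchmark'.
measured_accuracy = (respondents - sum(errors.values())) / respondents

# What the score would be if only failures of understanding counted.
competence_accuracy = (respondents - errors["competence"]) / respondents

print(f"measured: {measured_accuracy:.0%}, competence: {competence_accuracy:.0%}")
```

Under these assumptions the reported score lands in the mid-80s even though nobody lacks the underlying understanding.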

Now, when a machine learning model gets it wrong, it does it because it miscalculated a probability based on an opaque feature set it constructs from lots of examples. When you get 2 people together, they can always figure out the right answer and discuss why they got it wrong. No machine learning algorithm can do that.

This becomes even more obvious when we take an example from the actual GLUE benchmark:

Maude and Dora had seen the trains rushing across the prairie, with long, rolling puffs of black smoke streaming back from the engine. Their roars and their wild, clear whistles could be heard from far away. Horses ran away when they came in sight.

So what does the ‘they’ refer to here? The obvious candidate here is ‘trains’. But it is easy to imagine that a person could click the option where ‘puffs of black smoke’ or even ‘Maude and Dora’ are the antecedent. That’s because both of those can be ‘seen’ and could theoretically cause horses to run away. If this is the 10th sentence I’m parsing in a go, I may easily shortcut the rather complex syntactic processing. I can even see someone choosing “whistles” even though they cannot “come in sight” but are a very strong candidate for causing horses to run away. But nobody would choose ‘horses’ unless they misclicked. A machine learning algorithm very easily could do this simply because ‘they’ and ‘horses’ match grammatically.

But all of this is actually irrelevant, because of how the ML algorithms are tested. They are given multiple pairs of sentences and asked to say 1 or 0 on whether they match or not. So some candidate sentences above are “Horses ran away when the trains came in sight.”, “Horses ran away when Maude and Dora came in sight.” or “Horses ran away when the whistles came in sight.” What it does NOT do is ask “Which of the words in the sentence does ‘they’ refer to?” Because the ML model has no understanding of such questions. You would have to train it for that task separately or just write a sequential algorithm to process these questions.
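A rough sketch of what such a test item looks like as data may help. The class name, field names and the 1/0 labels below are my own assumptions for illustration (the labels follow the reading that ‘they’ means the trains); they are not copied from the benchmark files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentencePair:
    premise: str      # the original text containing the pronoun
    hypothesis: str   # the same sentence with the pronoun replaced
    label: int        # 1 = the two match, 0 = they do not


# Shortened to the last sentence; the real item carries the whole passage.
PREMISE = "Horses ran away when they came in sight."

pairs = [
    SentencePair(PREMISE, "Horses ran away when the trains came in sight.", 1),
    SentencePair(PREMISE, "Horses ran away when Maude and Dora came in sight.", 0),
    SentencePair(PREMISE, "Horses ran away when the whistles came in sight.", 0),
]


def accuracy(predictions, gold_pairs):
    """Fraction of 1/0 predictions that agree with the gold labels."""
    correct = sum(int(p == pair.label) for p, pair in zip(predictions, gold_pairs))
    return correct / len(gold_pairs)


# A 'model' that answers 1 to everything still gets partial credit.
print(accuracy([1, 1, 1], pairs))
```

Nothing in this format asks what ‘they’ refers to; the score is only the share of 1/0 answers that agree with the labels.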

What people running these contests also cannot do is ask the model to explain its choice in a way that would show some understanding. There is a lot of work being done on interpretability, but this just spits out a bunch of parameters that have to be interpreted by people. Game, set and match to humans.

Chinese room revisited

But let’s also think about what it means for a neural network model to get things right. This brings us back to Searle’s famous Chinese room argument. Every single choice a model makes is assigned a probability and even quite ridiculous choices have a non-zero chance of being right in the model. Let’s look at another common example:

The animal didn’t cross the road because it was too busy.

Here it is sensible to assign it to ‘road’ because it makes the most sense but one could imagine a context in which we could make it refer to ‘the animal’. Animals can be thought of as busy and we can imagine that this could be a reason for not crossing the road. But we know with 100% certainty that it does not refer to ‘the’ or even ‘cross’. Yet, a neural model has no such assurance. It may never choose ‘the’ in practice as the antecedent for ‘it’ but it will never completely discount it, either.

So, even if the model got everything right, we could hardly think of it as making human-like inferences unless it could label certain antecedents as having 0% probability and others (much rarer) as having 100%. (Note: Programming it to change 10% to 0% or 90% to 100% does not count.)
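Here is a small sketch of why a neural model never quite reaches 0% or 100%. It assumes the model turns its internal scores into probabilities with a softmax, which is the usual final step in neural classifiers; the candidate list and the scores are invented for illustration and do not come from any real model.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical antecedent candidates for 'it' and made-up model scores.
candidates = ["road", "animal", "cross", "the"]
scores = [6.0, 3.0, -4.0, -8.0]

probs = dict(zip(candidates, softmax(scores)))
for word, p in probs.items():
    print(f"{word:>6}: {p:.8f}")
```

Because the exponential of any finite score is positive, ‘the’ and ‘cross’ keep a tiny but non-zero share and ‘road’ stays just short of certainty. (With extreme enough scores, floating-point rounding can print 0.0, but that is a limit of the number format rather than a judgment by the model.)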

This feels like a very practical expression of Searle’s Chinese room argument albeit in a weak form. Neural networks pose a challenge to Searle because their algorithmic guts are not as exposed as those of the expert systems of Searle’s time. But we can still see echoes of their lack of actual human-like reasoning in their scores.

Is a test of artificial intelligence possible under Goodhart’s Law?

I once attended a conference on AI risk where a skeptic said he wasn’t going to worry “until an AI could do Winograd schemas”. This referred to a test of common sense and linguistic ambiguity that AIs have long been famously bad at. Now Microsoft claims to have developed a new AI that is comparable to humans on this measure. (Scott Alexander)

This post was inspired by the above remark by Scott Alexander. I wanted to explain why even the Winograd challenge being conquered is not enough in and of itself.

AI proponents constantly complain of sceptics’ shifting standards. When AI achieves a benchmark, everybody scrambles to find something else that could be required of it before it gets a pass. And I admit that I may have made a claim similar to that of the AI researcher quoted by Scott Alexander when I was writing about the Winograd schemas.

But the problem here is not that machines became intelligent and everybody is scrambling to deny the reality. The problem is that they got better at passing the test in ways that nobody envisioned when the test was designed. All this while taking no steps towards actual intelligence. Although with a possible increase in practical utility.

This is the essence of Goodhart’s Law: “When a measure becomes a target, it ceases to be a good measure.” The Winograd Schema Challenge seemed so perfect. Yet, I can imagine a machine learning model getting good at passing the challenge but still not actually having any of the cognition necessary to really deal with the tasks in real life. In the same way that IBM Watson got really good at Jeopardy but failed at everything else.

None of this is to say that machine learning could not get good enough at performing many tasks that were previously thought to require generalised cognitive capacity. But when machines actually achieve human-level artificial intelligence, we will know. It will not be that hard to tell. But it will not likely happen just because we’re doing more of the same.

The problem with the Turing test or imitation game is not that it cannot produce reliable results on any one run of it. The problem is that if any single test becomes not only the measure but also a target, it is very much possible to focus on passing the test on the surface while bypassing the underlying abilities the test is meant to measure. But the problem is not just with the individual tests but rather in the illusion that we can design a test that will determine AGI level performance simply by reaching an arbitrary threshold.

The current Turing test winners won by misdirection that hid the fact that they refused to answer the questions. This could be fixed by requiring that Grice’s cooperative principle maxims are observed (especially quality and relevance) but even then, I could see a system trained to deal with a single time-bound conversation pass without any underlying intelligence.

As Scott Aaronson showed, it is possible to defeat a current level AI system simply by asking ‘What is bigger, a shoebox or Mount Everest?’ But once a pattern of questioning becomes known, it becomes a target and therefore a bad measure.

Similar things happen with all standardised aptitude tests designed so that they cannot be studied for, and with job interview techniques designed to get interviewees to reveal their inner strengths and weaknesses. All of these immediately spawn industries of prep schools, instructional guides, etc. That makes them less useful over time (assuming they were all that useful to start with).

Towards a test by Critical Turing Internship

That’s why the Turing test cannot be a ‘test’ in the traditional sense. At the very least, it cannot be a single test.

History and a lot of human-computer interaction research have also shown that people are very bad at administering the Turing test (or playing the imitation game). But this is paradoxically because they’re very good at the very thing the machines have been failing at: meaning making. Because we almost never encounter meaningless symbols but often encounter incomplete ones, we are conditioned to always infer some sort of meaning from any communication. And it is difficult if not impossible to turn it off.

Every time we see a bit of language we automatically imbue it with some meaning. So, any Turing tester must not only be trained in the principles of cognition but also to discard their own linguistic instincts. We don’t know what it will take for a machine to become truly intelligent but we do know that humans are notoriously bad at telling machines apart from other humans. We simply cannot entrust this sort of thing to such feeble foundations.

As I said above, I suspect that by the time machines do achieve human-level performance on these tasks, it will be obvious. We probably won’t need such a test. Assuming we get there which is not a given. But if a test were needed, it could look something like this.

To replace the Turing test, I would like to propose a sort of Turing Internship. We don’t entrust critical tasks in fields like medicine to people who just passed a test but require they prove themselves in a closely supervised context. In the same way, we should not trust any AI system based on a benchmark.

Any proposed human-level AI system can be placed in multiple real contexts with several well-informed human supervisors who would monitor its performance for a period of weeks or months to allow for any tricks to be exposed. For example, most people after a few weeks with Alexa, Google Assistant or Siri, get a clear picture of their strengths and limitations. Five minutes with Alexa may make you feel like the singularity is here. Five months will firmly convince you that it is nowhere in sight.

But at the moment, we don’t need this. We don’t need months or weeks to evaluate AI for human-level intelligence. We need minutes. I estimate that we would not need to use this kind of AI internship for another 50 years but likely for much, much longer. We are too obsessed with the rapid progress of some basic technologies but ignore many examples of stagnation. My favourite here is the Roomba which has been on the market for 17 years now and has hardly progressed at all. Equally, the current NLP technologies have made massive strides in utility but have not progressed towards anything that could be meaningfully described as understanding.

That is not to say that tests like GLUE or even BLEU are completely useless. They can certainly help us compare ML approaches (up to a point). They’re just useless for comparing human performance with that of machine-generated models.

Note on Nick Bostrom and Superintelligence

One obvious objection to the Turing Internship idea is that if human-level AI is the last step before Bostrom’s ‘Superintelligence’, unleashing it in any real context would be extremely dangerous.

If you believe in this ‘demon in the machine’ option, there’s nothing I can do to convince you. But I personally don’t find Superintelligence in any way persuasive. The reason is that most of the scenarios described are computationally infeasible in the first place. Bostrom hardly mentions the issue of computability and things like P=NP at all. And he completely ignores questions of nonlinear complexity.

It is hard to judge whether a ‘superintelligent’ system could take over the world. But could it predict the weather 20 days out with 1% tolerance of temperature estimates in any location? The answer is most likely not. There may not be enough atoms in the universe to compute the weather arbitrarily precisely more than a few days in advance. Could it predict earthquakes? Could it run an economy more efficiently than an open market relying on price signals? The answers to all those questions are most likely no. Not because the superintelligence is not super enough but because these may not be problems that can be solved by adding ‘more’ intelligence. Assuming that ‘intelligence’ is a linearly scalable property in the first place. It may well be like body size: after a certain amount of increase, it would just collapse onto itself.

Superintelligence requires a conspiracy theorist’s mindset. Not that people who believe in it are conspiracy theorists. But they assume that complexity can be conquered with intelligence. They don’t believe that humans are ‘smart’ enough to control everything. But they believe that it is inherently possible. Everything we know about complexity suggests that this is not the case. And that is why I’m not worried.

Fruit loops and metaphors: Metaphors are not about explaining the abstract through concrete but about the dynamic process of negotiated sensemaking


Note: This is a slightly edited version of a post that first appeared on Medium. It elaborates on the examples I gave in the more recent posts on metaphor and explanation and understanding.

One of the less fortunate consequences of the popularity of the conceptual metaphor paradigm (which is also the one I by and large work with on this blog) is the highlighting of the embodied metaphor at the expense of others. This gives the impression that metaphors are there to explain more abstract concepts in terms of more concrete ones.

Wikipedia: “Conceptual metaphors typically employ a more abstract concept as target and a more concrete or physical concept as their source. For instance, metaphors such as ‘the days [the more abstract or target concept] ahead’ or ‘giving my time’ rely on more concrete concepts, thus expressing time as a path into physical space, or as a substance that can be handled and offered as a gift.”

And it is true that many of the more interesting conceptual metaphors that help us frame the fundamentals of language are projections from a concrete domain to one that we think of as more abstract. We talk about time in terms of space, emotions in terms of heat, thoughts in terms of objects, conversations as physical interactions, etc. We can even deploy this aspect of metaphor in a generative way, for instance when we think of electrons as a crowd of little particles.

But I have come to view this as a very unhelpful perspective on what metaphor is and how it works. Instead, going back to Lakoff’s formulation in Women, Fire, and Dangerous Things, I’d like to propose we think of a metaphor as a principle that helps us give structure to our mental models (or frames). But unlike Lakoff, I like to think of these as an incredibly dynamic and negotiated process rather than as a static part of our mental inventory. And I like to use conceptual integration or blending as a way of thinking about the underlying cognitive processes.

Metaphor does two things: 1. It helps us (re)structure one conceptual domain by projecting another conceptual domain onto it and 2. In the process of 1, it creates a new conceptual domain that is a blend of the two source domains.

We do not really understand one domain in terms of another through metaphor. We ‘understand’ both domains in different ways. And this helps us create new perspectives which are themselves conceptual domains that can be projected or projected into. (As described by Fauconnier and Turner in The Way We Think).

This makes sense when we look at some of the conventional examples used to illustrate metaphors. “The man is a lion” does not help us understand lesser known or more abstract ‘man’ by using the better known or more concrete ‘lion’. No, we actually know a lot more about men and the specific man we’re thus describing than we do about lions. We are just projecting the domain of ‘lions’ including the conventionalised schemas of bravery and fierceness onto a particular man.

This perspective depends on our conventionalised way of projecting these 2 domains. Comparison between languages illustrates this further. The Czech framing of lions is essentially the same as English but the projection into people also maps lion’s vigour into work to mean ‘hard working’. So you can say “she works as a lion”, meaning she works hard. But in the age of documentaries about lions, a joke subverting the conventionalised mapping also appeared and people sometimes say “I work like a lion. I roar and go take a nap.” This is something that could only emerge as more became conventionally known about lions.

But even more embodied metaphors do not always go in a predictable direction. We often structure affective states in terms of the physical world or bodily states. We talk about ‘being in love’ or ‘love hitting a rocky patch’ or ‘breaking hearts’ (where metonymy also plays a role). But does that really mean that we somehow know less about love than we know about travelling on roads? Love is conventionally seen as less concrete than roads or hearts but here we allow ourselves to be misled by traditional terminology. The domain of ‘love’ is richly structured and does not ‘feel’ all that abstract to the participants. (I’d prefer to think of ‘love’ as a non-prototypical noun; more prototypical than ‘rationalisation’ but less prototypical than ‘cat’).

Which is why ‘love’ can also be used as the source domain. We can say things like “The camera loves him.” and it is clear what we mean by it. We can talk about physical things “being in harmony” with each other and thus helping us understand them in different ways despite harmony being supposedly more abstract than the things being in harmony.

The conceptual domains that enter into metaphoric relationships are incredibly rich and multifaceted (nothing like the dictionaries or encyclopedias we often model linguistic meaning after). And the most important point of unlikeness is their dynamic nature. They are constantly adapting to the context of the listeners and speakers, never exactly the same from use to use. We have a rich inventory of them at our disposal but by reaching into it, we are also constantly remaking it.

We assume that the words we use have some meanings but it is we who have the meanings. The words and other structures just carry the triggers we use to create meanings in the process of negotiation with the world and our interlocutors.

But this sounds much more mysterious and ineffable than it actually is. These things are completely mundane and they are happening every time we open our mouths or our minds. Here’s a very simple but nevertheless illuminating illustration of the process.

Not too long ago, there were two TV shows that had some premise similarities (Psych and The Mentalist). One of them came out a year earlier and its creators were feeling like their premise was copied by the other one. And they used the following analogy:

“When you go to the cereal aisle in a grocery store, and you see Fruit Loops there. If you look down on the bottom, there’s something that looks just like Fruit Loops, and it’s in a different bag, and it’s called Fruity Loop-Os.” 

I was watching both shows at the time but their similarity did not jump out at me. But as soon as I read that comparison it was immediately clear to me what the speaker was trying to say. I could automatically see the projection between the two domains. But even though it seemed the cereal domain was more specific, it actually brought a lot more with it than the specificity of cereal boxes and their placement on store shelves. What it brought over was the abstract relationship between them in quality and value but also many cultural scripts and bits of propositional knowledge associated with cereal brands and their copycats.

But there was even more to it than that. The metaphor does not stop at its first outing (it’s kind of like mushrooms and their  in this way). Whenever I see a powerful analogy or generative metaphor on the internet, I always look for the comments where people try to reframe it and create new meanings. Something I have been calling ‘frame negotiation’. Take almost any salient metaphoric domain projection and you will find that it is only a part in a process of negotiated sense making. This goes beyond people simply disagreeing with each other’s metaphors. It includes the marshalling of complex structuring conceptual phenomena from schemas, rich images, scenarios, scripts, to propositions, definitions, taxonomies and conventionalised collocations.

This blog post and its comments contain almost all of them: . First, the post author spends three paragraphs (from third on), comparing the two shows and finding similarities and differences. This may not seem like anything interesting but it reveals that the conceptual blends compressed in the cereal analogy are completely available and can be discussed as if it was a literal statement of fact.

Next, the commenters, who have much less space, return to debating the proposition by recompressing it into more metaphors. These are the first four comments in full:

  1. Anonymous said… They’re not totally different. It’s more like comparing Fruit Loops to Fruit Squares which happen to taste like beef.
  2.  said… I think a better comparison would Corn Flakes and Frosted Flakes. Both are made with the same cereal, but one’s sweeter (Psych).
  3.  said… Sweeter as in more comedy oriented? They are vastly different shows that are different on so many levels.
  4. Anonymous said… nikki could not be more right with the corn flakes and frosties analogy

Here we see the process of sense making in action. The metaphoric projection is used as one of several structuring devices around which frames are made. Comment 1 opens the process by bringing in the idea of reframing through other analogs in the cereal domain. Comment 2 continues that process by offering an alternative. Comment 3 challenges the very idea of using these two domains and comment 4 agrees with 2 as if this were a literal statement but also refers to the metalinguistic tool being used.

The subsequent comments return to comparing the two shows, some by offering propositions and scenarios, others by marshalling a new analogy.

 said… The reason the Mentalist feels like House is because house is a modern day medical version of Homes as in Holmes Sherlock. Also both Psych and The Mentalist are both Holmsian in creation. That being said I love the wit and humor of psych

Again, there is no evidence of the concrete/abstract duality or even one between less and better known domains. It is all about making sense of the domains in both cognitive and affective ways. Some domains have very shallow projections (partial mappings) such as cornflakes and frosty flakes, others have very deep mappings such as Sherlock Holmes. They are not providing new information or insight in the way we traditionally think of them. Nor are they providing an explanation to the uninitiated. They are giving new structure to the existing knowledge and thus recreating what is known.

The reason I picked such a seemingly mundane example is because all of this is mundane and it’s all part of the same process. One of my disagreements with much of metaphor application is the overlooking of the ‘boring’ bits surrounding the first time a metaphor is used. But metaphors are always part of complex textual and discursive patterns and while they are not parasitic on the literal, as was the traditional slight against them, they are also not the only thing that goes on when people make sense.

Writing as translation and translation as commitment: Why is (academic) writing so hard?

This book will perhaps only be understood by those who have themselves already thought the thoughts which are expressed in it—or similar thoughts. It is therefore not a text-book. Its object would be attained if there were one person who read it with understanding and to whom it afforded pleasure.
(opening sentence of the preface to Tractatus Logico-Philosophicus by Ludwig Wittgenstein, 1918)

Background

I’ve recently been commenting quite a lot on the excellent academic writing blog (which I mostly read for the epistemology) Inframethodology by Thomas Basbøll. Thomas and I disagree on a lot of details but we have a very similar approach to formulating questions about knowledge and its expression.

The recent discussion was around the problem of ‘writing as expressing what you know’. While I find it very useful to distinguish between writing to describe what you know and writing to explore and discover new ideas (something I first reflected on after reading Inframethodology), I commented:

I still find that no matter how well I think I know my subject, I discover new things by trying to write it down (at least with anything worth writing).

Thomas responded in a separate blogpost, first picking up on my parenthetical:

Can it really be true that the straightforward representation of a known fact is not “worth writing”? Is the value of writing always to be discovered (by way of discovering something new in the moment of writing)? I think Dominik is thinking of kinds of writing that are indeed very valuable because they present ideas that move our own thinking forward and, ideally, contribute positively to the thinking of our peers. But I also think there is value is writing that doesn’t do this, writing that is, for lack of a better word, boring.

With this, I agree wholeheartedly. 110% coach! Yes, this was a throwaway line I wasn’t comfortable with even as I was writing it. The majority of my writing is mundane: emails, instruction manuals, project proposals, etc. They may or may not be “worthy” but they certainly have a worth. And people who do nothing but that sort of writing certainly do not do anything I would find ‘beneath me’ or not worthy. I might have been better served by the term ‘quotidian’ or even ‘instrumental’ writing.

I agree even more with Thomas’s elaboration (my emphasis):

In fact, I think it’s the primary of value of academic writing and one of the reasons that so many people (and even academics themselves) almost equate “academic” (adj.) with “boring”. The business of scholarship is not to bring new ideas into the world, indeed, the function of distinctively academic work (in contrast to, say, scientific or philosophical or literary work) is not to innovate or discover but to critique, to expose ideas to criticism. In order for this happen efficiently and regularly, academics must spend some of their time representing ideas that are not especially exciting to them along with their grounds for entertaining them. They must present their beliefs to their peers along with their justification for thinking they’re true. And they must do this honestly, which is to say, they must not invent new beliefs or new reasons for holding them in the moment of writing. They must write down, not what they’re thinking right now, but what they’ve been thinking all along.

I find this an incredibly valuable perspective and when I think of my own writing, I think this is precisely where I’ve often been going wrong. This is partly because academic writing is more of a hobby than a job, so I don’t have the time to do more than write to discover. But it is partly because of my temperament. I don’t enjoy the boring duties of writing things I know down and then formatting them for the submission to a journal. I prefer to work with editors which is why the bulk of my published writing is in journalism or book chapters.

But there is still another aspect that needs to be explored. And that is: why do most people find it so difficult to write down what they know, even taking into account all of the above?

Writing as translation

I propose that a good way to think about the difficulty of writing to describe our thoughts is to use the metaphor of translation. We can then think of the content of our thoughts in our head as a series of propositions expressed in some kind of ‘mentalese’. And when we come to write them down, we are essentially translating them into ‘writtenese’ or in this case, one of its dialects ‘academic writtenese’.

This is made more complicated by the existence of a third language – let’s call it ‘spokenese’. We are all natively bilingual in ‘mentalese’ and ‘spokenese’ even if not everybody is very good at translating between these two languages. In fact, children find it very difficult until quite late ages (10 and up) to coherently express what they think and even many adults never achieve great facility with this. Just like many natively bilingual speakers are not very good at translating between their two languages.

But nobody is a native speaker of ‘writtenese’. Everybody had to learn it in school with all its weird conventions and specific processing requirements. It is not too outlandish to say (and I owe this to the linguist Jim Miller) that writing is like a foreign language. (Note: see some important qualifications below).

When we are translating from mentalese to academic writtenese, we are facing many of the same problems that translators between very different languages face. The one I want to focus on is ‘making commitments’.

Translation as commitment: Making the implicit explicit

Perhaps the most difficult problem for a translator (I speak as someone who has translated hundreds of thousands of words) is the issue of being forced by the way the target language operates to commit to meanings in the translation where the structure of the source language left more options for interpretation.

Let’s take a simple paragraph consisting of three sentences (Note: this is a paraphrase of an example given by a Czech-Finnish translator at a conference I attended some years ago):

The prime minister committed to pursue a dialogue with the opposition. This was after the opposition leader complained about not being involved. She confirmed that he would have a seat at the table in the upcoming negotiations.

The first commitment I have to make at some point is to the gender of the participants in the actions I write about. In English, I can leave the gender ambiguous until the third sentence. In Finnish, which does not have gendered third-person-singular pronouns, I don’t have to express the gender at all.

In Czech (and many other languages), on the other hand, I have to know the gender of the prime minister from the very first word. Like actor and actress in English, all nouns describing professions have built-in genders (this is not optional as in English because all Czech nouns are assigned a grammatical gender). I also need to express gender as part of the past tense morphology of all verbs. So even if I could skirt the gender of the ‘leader’ (there are some gender-ambiguous nouns in Czech), I would have to immediately commit to it with the verb ‘complained’. Which is why knowledge of their subject is essential to simultaneous translators.

But this is a relatively simple problem that can be solved by reference to known facts about the world. A much more significant issue is the differential completion of certain schemas associated with types of expressions. Let’s take the phrase ‘committed to pursue’. The closest translation to the word ‘commit’ is ‘zavázat se’ which unfortunately has the root ‘bind’. It is therefore ever so slightly more ‘binding’ than ‘commit’. I can also look into something like ‘promise’ which of course is precisely what the prime minister did not do.

Then, there is the word ‘pursue’. One way to translate it is ‘usilovat o’ which has connotations of ‘struggle to’. So ‘usilovat o dialog’ is in the neighborhood of ‘pursue a dialog’ but lacks the sense of forward motion making it seem slightly less like the dialog is going to happen. So here each language is making subtly different commitments.

When you’re translating academic writing, there are hundreds of similar examples, where you have to fill in blanks and make some claims seem stronger and others weaker. And even if you know the subject intimately (which I did in most cases), you often have to insert your judgement and interpretation. And the more you do that, the less certain you feel that you got the meaning of the original exactly right. This is so even when, while reading the original, I had no sense of something being left unexpressed. The only way to get this right is to ask the author. But even that may not always work because they may not remember their exact mental disposition at the time of writing.

Writing as filling in holes in our mind

I believe that this is exactly the experience we have when we write about something that only exists in our head or something we’ve only previously talked about. Even when I’ve given talks at conferences and had many conversations with colleagues, writing my ideas down remains a difficult task.

When writing, the structure of ‘writtenese’ (as well as the demands of its particular medium) forces me to make certain commitments I never had to make in ‘mentalese’ (or even ‘spokenese’). I have to fill out schemas with detail that never seemed necessary. I have to make more commitments to the linearity of arguments that could previously run parallel in my head. So when I write, it is not clear what should come first and what last.

When I just write down what’s in my head (or as close to it as it is possible), it is unlikely to make any sense to anybody. Often including myself after some time. I need to translate it in such a way that all the necessary background is filled out. I also need to use the instruments of cohesion to restore coherence to the written text that I felt in my mind without any formal mental structure.

But during this process, I often become less certain. The act of writing things down triggers other associations and all of a sudden I literally see things from a different perspective. And this is often not a comfortable experience. Many writers find this a source of great stress.

This is, of course, true even of writing instructions and directions. Often, when describing a process, we find there are gaps in it. And when writing down directions, we come to realise that we may not know all aspects of the familiar sufficiently well to mediate the experience to someone else.

Teaching writing as translation

Translation is a skill that requires a lot of training and practice. In many ways, a translator needs to know more about both languages than a native speaker of either. And then they need to know about different ways of finding equivalent expressions between the two languages in such a way that the content expressed in the source language produces similar mental effects when reading in the target language. This is not easy. In fact, it is frequently impossible to achieve perfectly.

When I translate I often refer to a dictionary (such as slovnik.cz) that lists as many possible alternatives of words even if I know exactly what the original ‘means’. This is because I want to see multiple options of expressing something which may not be immediately triggered by my understanding of the whole.

But for this to work, I need to have done a lot of deliberate reading in both languages to know how they tend to express similar things. At the early stages, I may approach this more simply as learning to speak a language. I may learn that ‘commit to pursue’ is best translated as ‘zavázat se usilovat o’. But I have to back that up by a lot of reading in both languages, studying other translators’ work and making hypotheses about both languages and the differences between them. Eventually, this becomes second nature and to translate fluently, we need to ‘forget’ the rules and ‘just do it’.

So how could we apply this to teaching (academic) writing? We need to start by ensuring that students have enough facility in both the source and the target languages. We usually assume greater fluency in the source language (most translators work primarily in the direction of native to non-native). So in this case, we need to focus on the structures and ways of ‘academic writtenese’.

We can very much approach this as teaching a foreign language. Our first aim should be to help students acquire fluency in the language of academic writing. We need to give them some target structures to learn. This should ideally be based on an actual analysis of that writing rather than focusing on random salient features. But ultimately, the key element here is practice.

Then we also need to focus on helping the students develop better awareness of their native mentalese and how to best map its structures onto the structures of writtenese. We can do this by helping them write outlines, create mind maps, come up with relevant key words, and of course, read a lot of other people’s writing, think about it, and then write summaries in similar ways.

None of these are particularly revolutionary ideas and they are being used by writing teachers all over the world. What I’m hoping to do here is to provide a metaphor to help focus the efforts on particular aspects of what makes the translation from thought to writing difficult.

Writing as playing a musical instrument

One final analogy that can help us here is the idea of writing as playing a musical instrument. This analogy is in many ways even more apt. When we play a musical instrument, we are initially translating relatively vague musical ideas into actual notes (melodies and harmonies) by way of the structures given to us by the musical instrument.

We may start by learning some chords to accompany a song we hear but later we will progress into more details of musical theory which will allow us to express more elaborate ideas. But, in fact, this also allows us to have those more elaborate ideas in the first place.

Initially, our ability to express musical ideas via an instrument (such as piano or guitar) will be limited by our skill. We may not even realize what exactly the idea in our head was until we’ve played it. And often, what we can play limits the ideas we have. Jazz teachers often say something like ‘sing your solos first and then play’ (others call it ‘audiation’). But this is not trivial and requires extensive training. Which is why one common piece of advice for jazz musicians is to transcribe (or at least copy) famous songs and solos. But as you’re transcribing and copying, you’re supposed to notice patterns in how musical ideas are expressed. You can then recombine them to express what is in your ‘musical mind’.

But it seems that the musical ideas and their form of expression are never completely separate. They are not a pure translation but rather a co-creation. And this is true of any good translation and probably also ultimately true about any act of writing. We are using a different medium to express an existing idea but in the process, we are filling gaps in the ideas, creating new connections until we ultimately cannot be completely certain which came first.

As we get better at translation, music or writing, there are some levels about which the last part does not hold true. There are some ideas we can truly and faithfully translate from our head to paper, musical instrument or from one language to another. This is why practice is so important. But at the highest levels of difficulty, writing, translation and music making will always be acts of co-creation between the medium and the message.

Teaching writing as music

So finally, could we teach writing in the same way as we teach music? We certainly could. Just like teaching a foreign language, teaching music is mostly dependent on a lot of practice.

But perhaps there are some techniques that music teachers use that could be useful for language teachers, translators and writing coaches alike.

One is the emphasis on patterns. The idea of practicing scales, licks, or chords relentlessly (up to hours a day) holds a lot of appeal. Perhaps we start teaching self-expression with writing too soon. Maybe we should give students some practice patterns to repeat in different combinations. Then we could tell them to just copy and then dissect parts of good texts. The idea of ‘mindless’ copying will probably stick in many teachers’ craws. But just analysing reading will never be enough. Students need the experience of writing some good writing. If only to develop some muscle memory. And while it should never be completely mindless, it should also perhaps not be completely meaningful from the very start. Of course, we could invent numerous variations on this approach to transform the texts in various fun ways while still making sure students are writing extended chunks and developing fluency. The point is that we would not be focusing on self-expression but developing a language for self-expression.

Music teachers and students use what has been described by Anders Ericsson as ‘deliberate practice’. Ericsson gives the example of Benjamin Franklin who used similar techniques to improve his writing:

He first set out to see how closely he could reproduce the sentences in an article once he had forgotten their exact wording. So he chose several of the articles whose writing he admired and wrote down short descriptions of the content of each sentence—just enough to remind him what the sentence was about. After several days he tried to reproduce the articles from the hints he had written down. His goal was not so much to produce a word-for-word replica of the articles as to create his own articles that were as detailed and well written as the original. Having written his reproductions, he went back to the original articles, compared them with his own efforts, and corrected his versions where necessary. This taught him to express ideas clearly and cogently.

Obviously, this was not all there was to it, but it is very much reminiscent of what music students do. It seems to me that most beginner writers are often asked to do too much at the very start and they never get a chance to improve because they essentially give up too soon.

Writing is NOT a foreign language, translation or music: The Unmetaphor

Writing is writing! It has its specific properties that we need to attend to if we want to see all of its complexities. We must use metaphors to help us do this but always by remembering that metaphors hide as much as they reveal. One useful way of understanding something is to create a sort of unmetaphor: a listing of similar things that are different from it in various respects. This is something that, while not uncommon, is done much less than it should be when using analogies.

Written language is not a foreign language

Some of the fundamental mental orientations of a language are shared between the written and spoken forms. This includes tense, aspect, modality, definiteness, case morphology, word categories, meanings of most function words, the shape of words, etc. These present some of the most significant difficulties to learners of foreign languages making it very difficult to acquire a second language by exposure alone after a certain age for most adults.

Writing, on the other hand, can be acquired predominantly by exposure alone for many (if not most) adults. There are many people who acquire native-like competence in the written code in the same way they acquired their spoken language competence (even if there are just as many who never do). And we must also be mindful (as Douglas Biber’s research revealed) that there is a bigger difference between some written genres than there is between writing and speech overall. So we should perhaps attend to that.

Writing is not translation

That writing is not actually translation is contained in the fact that written language is not actually a foreign language. There are many genres and registers in any language with their specific codes. And we could call going from one code to another translation much more easily than going from what I called ‘mentalese’ to ‘writtenese’. (Again, the work of Douglas Biber should be the first port of call for anyone interested in this aspect of writing.)

But most importantly, what I called ‘mentalese’ does not actually have the form of a language. Individuals differ in how they represent thoughts that end up being represented by very similar sentences. Some people rely on images, others on words. For some, the mental images are more schematic and for others, they have more filled-in details. For instance, Lakoff asked how different people imagine the ‘hand’ in ‘Keep somebody at arm’s length’. And the responses he got were that for some the hand is oriented with the palm out, for others with the palm in. For some, it includes a sleeve, for others it does not. Etc.

Writing is not music

I’ve already written about the 8 ways in which language is not like music. And they all apply to writing, as well. The key difference for us here is that music cannot express propositions. This means that musical expression can be a lot freer than expressing ideas through writing.

We could argue that writing is more like music than spoken language because it requires some kind of an instrument. Pen, paper, computer, etc. But we usually learn these independently of the skill of expressing ourselves through writing. My ability to play the piano is much more closely tied to my ability to express my musical meanings. However, people write just as expressive prose by the hunt and peck method as when they touch type. One can even dictate a ‘written text’ – that’s how independent it is of the method of production.

Of course, improving one’s facility with the tools of production can improve the writing output just by removing barriers. This is why students are well-advised to learn to touch type or to use a speech-to-text method if they struggle for other reasons (e.g. visual impairment or dyslexia). But when it comes down to it, this is just writing down words and as we established, writing in most senses is more than that.

Conclusions and limitations

Ultimately, writing and translation are not the same. Just as writing and music are not the same. But there are enough similarities to make it worthwhile learning from each other.

Many writers have developed great skills by the ‘tried and tested’ approach of ‘just doing it’. But we also know that even many people who do write a lot never become very ‘good’ at it. They struggle with the mechanics, ability to express cogently what’s in their minds, or just hate everything about it.

For some beginner writers, the worst thing we could do is give them a lot of mindless exercises. These people will want to do it first and would hate to be held back. Just like many students of languages or music like to dive off the deep end. But equally, for many others, telling them to ‘just do it’ is the perfect recipe for developing an inferiority complex or downright phobias of writing.

But all of these writers will need lots of practice – regardless of whether we provide lots of ladders and scaffolding or just put a trampoline next to the edifice of their skill. In this, writing is exactly like music, language and translation. You can only get better at it by doing it. A lot!

I started with a quote from Wittgenstein. But he also famously said in summarising his book:

What can be said at all can be said clearly; and whereof one cannot speak thereof one must be silent.

I think we saw here that this is not necessarily how the act of writing presents itself to most people.

He then continued:

The book will, therefore, draw a limit to thinking, or rather—not to thinking, but to the expression of thoughts; for, in order to draw a limit to thinking we should have to be able to think both sides of this limit (we should therefore have to be able to think what cannot be thought). The limit can, therefore, only be drawn in language and what lies on the other side of the limit will be simply nonsense.

This was the so-called “early Wittgenstein” before the language games and family resemblances. He spent the rest of his career unpicking this boundary of sense and non-sense, coming to terms with the fact that what is thought and what is its expression are not straightforward matters.

So all the metaphors notwithstanding, we should be mindful of the constant tensions involved in the writing process and be compassionate with those who struggle to navigate them.

What would make linguistics a better science? Science as a metaphor

Background

This is a lightly edited version of a comment posted on Martin Haspelmath’s blog post “Against traditional grammar – and for normal science in linguistics”. In it he offers a critique of the current linguistic scene as being unclear as to its goals and in need of better definitions. He proposes ‘normal science’ as an alternative:

In many fields of science, comparative research is based on objective measurements, not on categories that are hoped to be universal natural kinds. In linguistics, we can work with objectively defined comparative concepts (Haspelmath 2010).

While I am in broad agreement with the critique, I’m not sure the solution is going to lead to a ‘better science’ of linguistics. (Also, I’m not sure that this is an accurate description of how science actually works.)

Problem with ‘normal science’ approach to linguistics

I would say that the problem with the ‘normal science’ approach is that it makes it seem natural to turn to a structural description as a mode of ‘doing good linguistics’. But I think that this is misleading as to the nature of language. The current challenge of the radical potential coming from constructionism (covering Fillmore, Croft, Goldberg) on the one hand and cognitive semantics (Lakoff, Talmy, Langacker) on the other makes a purely structural description (even imbued with functionalism) less appealing as the foundation of a newly scientificised linguistics.

It’s curious that it’s physics and chemistry that get mentioned in this context with their fully mathematised personas and not biology or geography. In both of those, precise definitions are much more provisional and iterative. Even foundational terms such as species or gene are much more fluid and less well defined than it might seem (I recommend Keller’s ‘Century of the Gene’ for an account of how discrepancies in how the gene is defined among various labs were actually beneficial for the development of genetics).

That’s not to say that I strongly disagree with any of Haspelmath’s proposals but they don’t particularly make me excited to do linguistics. I found Dixon’s ‘Basic Linguistic Theory’ an exhilarating read but it was not because I felt that Dixon’s programme would lead to more consistency but because it was a radically new proposal (despite his claims to the contrary) for a theoretical basis for a comparative linguistic research agenda. Which is also why I like Haspelmath’s body of work (exemplified in projects such as the World Atlas of Language Structures and Glottolog).

But I doubt that the road ahead is in better definitions. I’m not opposed to them, just skeptical that they will lead to much. The road ahead is in better data and better theory. I think that between corpus linguistics, frame semantics and construction grammar we can get both. I proposed the analogy of ‘dictionary and grammar being to language what standing on one foot is to running’. I think linguistics needs to embrace the dynamism of language as a human property rather than as a fixed effect (to borrow Clark’s phrase). Fillmore and Kay’s early writing on construction grammar was a first step but things seem to have settled into the bad old ways of static structural description.

Data and theory need each other in a dialectic fashion. You need data to create a theory but you needed some proto-theory to see the data in the first place. And then you need your theory to collect more data and that data then further shapes your theory which in turn lets you see the data in different ways. The difference between biology and linguistics is that our proto-theories of the biological world correspond much better to the dynamic structures which can be theorized (modeled) based on systematic data collection and its modeling. Which is why folk taxonomies of the biological world are much closer to those of botany or zoology than folk taxonomies of language are to linguistic structures. (They are much more elaborate – to start with – at least at the level available to human perception.)

My proposal is to take seriously people’s ability to reflect (hypostatically) on the way they speak (cf. Talmy’s defense of introspection) because this is at the start of any process that bootstraps a theory of language. We then need to be mindful of the way this awareness interacts with the subconscious automaticity in which the patterns of regularity we call structures seem to be used. In the same way that Fillmore and Kay asked any theory of grammar to account for the exceptions (or even take them as a starting point), I’d want to ask any theory of language to take bilingualism and code mixing as its starting point (inspired by Elaine Chaika) and take seriously the variability of acquisition of the ability to automatically use those structures.

None of this precludes or denies the utility of the great work of linguists like Haspelmath. But it is what I think would lead to linguistics being a ‘better’ science (at least, in the sense of Wissenschaft or ‘natural philosophy’ rather than in the sense implied by the physics envy which often characterizes these efforts).

Update: 

After I finished writing this, I was listening to this episode of the Unsupervised Thinking podcast where the group was discussing two papers critiquing some of the theoretical foundations of biology (“Can a biologist fix a radio?”) and neuroscience (“Could a Neuroscientist Understand a Microprocessor?”). The general thrust of the discussion was that better definitions would be important because they would allow better measurement and thus quantifiable models. But the discussion also veered towards the question of theory and pre-theoretical knowledge. To me it underscored the tension between data and theory.

My concern is only about the assumption that definitions are the solution. But I’d say that a definition (unless purely disambiguating of polysemy) is just a distillation or a snapshot of a slice in time in the never-ending push and pull of data and the model used to make sense of it as well as collect it (otherwise known as theory). This is not that different from the definition of a lexical item in a dictionary.

That is not to deny the heuristic usefulness of definitions. Which reminded me of the critique of modern axiomatic mathematics (in particular set theory and number theory) exemplified by NJ Wildberger in his online courses on Math Foundations. Wildberger is also calling for more precise definitions in mathematics and less reliance on axioms.

Future directions:

I outlined some of the fundamental epistemological problems with definitions (as a species of a referential theory of meaning).

I’m working on a more extensive elaboration of some of the issues of comparing epistemic heuristics used to model the physical and the social world with a subtitle “The differential susceptibility of units to idealization in the social and physical realms” that addresses some of the questions I outlined above.

In it, I want to suggest that the key difference between the social and physical sciences is due to how easy it is to usefully idealize units and sets in the physical and the social world. Key passage from this:

All of physics is based on idealization. You have ideal gas, perfect motion, perfect vacuum, etc. All of Newtonian physics is based on the mathematical description of a world where things like friction don’t exist. An ideal world, if you will. Platonic, almost. And it turned out that this type of idealization can take us extremely far, if we let engineers loose on it.

Because all the progress we attribute to science has really been made by engineers. People who take a ballistic curve and ask ‘how about we add a little cross wind’. The modern world of technology around us is all built on tolerances – encoded in books of tables describing how far we can take the idealized formulas of science into the non-ideal conditions of the ‘real’ world.

In the social world, the ideal individual and ideal society are more difficult to treat as units of analysis than perfect vacuum or ideal gas. But even if we could define them, there’s far less we can do with them that would make them anywhere near as useful as the idealizations of physics and chemistry. That’s why engineering a solution is a positive description when we talk about the physical world but a negative one when we talk about the social world.

3 burning issues in the study of metaphor

I’m not sure how ‘burning’ these issues are as such but if they’re not, I’d propose that they deserve to have some kindling or other accelerant thrown on them.

1. What is the interaction between automatic metaphor processing and deliberate metaphor application?

Metaphors have always been an attractive subject of study. But they have seen an explosion of interest since ‘Metaphors we live by’ by Lakoff and Johnson. In an almost Freudian turn, these previously seemingly superfluous baubles of language and mind became central to how we think and speak. All of a sudden, it appeared that metaphors reveal something deeper about our mind that would otherwise remain hidden from view.

But our ability to construct and deconstruct metaphors was mostly left unexamined. Yet this happens ‘literally’ all the time. People test the limits of ‘metaphor’ through all kinds of discursive patterns, from saying things like ‘X is more like Y’ to ‘X is actually Y’ or even ‘X is like Y because’.

How does this interact with the automatic, instantaneous and unconscious processing of language? (Let’s not forget that this is more common.)

2. What is the relationship between the cognitive (conceptual) and textual metaphor?

Another way to pose this question is: What happens in text and cognition in between all the metaphors? Many approaches to the study of metaphor only focus on the metaphors they see. They seem to ignore all the text and thought in between the metaphorical. But, often, that is most of what goes on.

With a bit of effort, metaphors can be seen everywhere but they are not the same kind of thing. ‘Time is money’, ‘stop wasting my time’, and ‘we spent some time together’ are all metaphorical and rely on the same conceptual metaphor of TIME IS SOMETHING THAT CAN BE EXCHANGED. But they are clearly not doing the same job of work for the speaker and will be interpreted very differently by the listener.

But there’s even more at stake. Imagine a sentence like ‘Stop wasting my time. I could have been weeding my garden or spending time with my children instead of listening to you.’ Obviously, the ‘wasting time’ plays a different role than in a sentence ‘Stop wasting my time. My time is money and when you waste my time, you waste my money.’ The conceptual underpinnings are the same, but the way they can be marshalled into meaning is different.

Metaphor analysts are only too happy to ignore the context – which could often be most of the text. I propose that we need a better model for accounting for metaphor in use.

3. What are the different processes used to process figurative language?

There are 2 broad schools of the psychology of metaphor. They are represented by the work of Sam Glucksberg and Raymond Gibbs. The difference between them can be summarised as ‘metaphor as polysemy’ vs ‘metaphor as cognition’. Metaphor, according to the first, is only a kind of additional meaning that words or phrases have, while the second approach sees it as a deep interconnected underpinning of our language and thought.

Personally, I’m much closer to the cognitive approach but it’s hard to deny that the experimental evidence is all over the place. The more I study metaphor, the more I’m convinced that we need a unified theory of metaphor processing that takes both approaches into account. But I don’t pretend I have a very clear idea of where to even start.

I think such a theory would also have to account for differences in how individuals process metaphors. There are figurative language pathologies (e.g. gaps in the ability to process metaphor are associated with autism). But clearly, there are also gradations in how well individuals can process metaphor.

Any one individual is also going to vary over time and specific instances in how much they are able and/or willing to consider something to be metaphorical. Let’s take the example of ‘education is business’. Some people may not consider this to be a metaphor and will consider it a straightforward descriptive statement along the lines of ‘dolphins are mammals’. Others will treat it more or less propositionally but will dispute it on the grounds that ‘education is education’, and therefore clearly not business. But those same people may pursue some of the metaphorical mappings to bolster their arguments. E.g. ‘Education is business and therefore, teachers need to be more productive.’ or ‘Education is not business because schools cannot go bankrupt’.

Bonus issue: What are the cognitive foundations shared by metaphor with the rest of language?

This is not really a burning issue for metaphor studies so much as it is one for linguistics. Specifically semantics and pragmatics but also syntax and lexicography.

If we think of metaphor as conceptual domain (frame) mapping, we find that this is fundamental to all of language. Our understanding of attributes and predicates relies on the same ability to project between 2 domains as does understanding metaphor. (Although, there seems to be some additional processing burden on novel metaphors).

Even seemingly simple predicates such as ‘is white’ or ‘is food’ require a projection between domains.

Compare:

  1. Our car is white.
  2. Milk chocolate is white.
  3. His hair is white.

Our ability to understand 1 – 3 requires that we map the domain of the ‘subject’ on to the domain of the ‘is white’ predicate. Chocolate is white through and through whereas cars are only white in certain parts (usually not tires). Hair, on the other hand, is white in different ways. And in fact, ‘is white’ can never be fully informative when it comes to hair because there are too many models. In fact, it is even possible to have opposite attributes mean the same thing. ‘Nazi holocaust’ and ‘Jewish holocaust’ are both used to label the same event (with similar frequency) and yet it is clear that they refer to one event. But this ‘clarity of meaning’ depends on projections between various domains. Some of these include ‘encyclopedic knowledge’. For instance, ‘Hungarian holocaust’ does not possess such clarity outside of specialist circles.

It appears that understanding simple predicates relies on the same processes as understanding metaphor does. What makes metaphor special then? Do we perhaps need to return to a more traditional view of metaphor as a rhetorical device but change the way we think about language?

That is what I’ve been doing in my thinking about language and metaphor but most linguistic theories treat these as unremarkable phenomena. This leads them to ignore some pretty fundamental things about language.

Does machine learning produce mental representations?

TL;DR

  • Why is this important? Many people believe that mental representations are the next goal for ML and a prerequisite for AGI.
  • Does machine learning produce mental representations equivalent to human ones in kind (if not in quality or quantity)? Definitely not, and there is no clear pathway from current approaches to a place where it would. But it is worth noting that mental representations in humans are also not something straightforward to identify or describe.
  • Is there a currently viable approach to ML that could eventually lead to mental representations with more engineering? It appears not but then again, no one expected neural nets would get so successful.

Update: Further discussion on Reddit.

Background

Over the last few months, I’ve been catching up more systematically on what’s been happening in machine learning and AI research in the last 5 years or so and noticed that a lot of people are starting to talk about the neural net developing a ‘mental’ representation of the problem at hand. As someone who’s preoccupied with mental representations a lot, this struck me as odd because what was being described for the machine learning algorithms did not seem to match what else we know about mental representations.

So I had been formulating this post when I was pointed to this interview with Judea Pearl. And he makes exactly the same point:

“That sounds like sacrilege, to say that all the impressive achievements of deep learning amount to just fitting a curve to data. From the point of view of the mathematical hierarchy, no matter how skillfully you manipulate the data and what you read into the data when you manipulate it, it’s still a curve-fitting exercise, albeit complex and nontrivial.”

He continues:

“If a machine does not have a model of reality, you cannot expect the machine to behave intelligently in that reality.”

What does this model of reality look like? Pearl seems to reduce it to ‘cause and effect’ but I would suggest that the model needs more than that. (Note: I haven’t read his book, just the interview and this intro.)

What are mental representations?

Mental representations are all sorts of images (ranging from rich to schematic and from static to dynamic) in our mind on which we draw sometimes consciously but mostly unconsciously to deal with the world. They are essential for producing and understanding language (from even the simplest sentence) and for basic reasoning. They can be represented as schemas, rich images, scenarios, scripts, dictionaries or encyclopedic entries. They can be in many modalities – speech, sound, image, moving picture.

Here are some examples to illustrate.

Static schemas

What does ‘it’ refer to in pairs of sentences such as these (example from here):

  1. The trophy wouldn’t fit into the suitcase because it was too big.
  2. The trophy wouldn’t fit into the suitcase because it was too small.

It takes no effort at all for a human to determine that ‘it’ in (1) refers to trophy and in (2) to suitcase. Why? Because we have schemas of containment and we know almost intuitively that big things don’t fit into smaller things. And when we project that schema onto trophy and suitcase we immediately know what has to be too big or too small in order for one not to fit into the other.
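
To make the idea of projecting a schema a little more concrete, here is a minimal sketch of a hand-coded containment schema for just this one sentence pair. It is a toy and not a claim about how people (or any real system) do it: the function name `resolve_it` and the rule inside it are hypothetical and written only for illustration.

```python
# Toy containment schema: a CONTENT fits into a CONTAINER only if it is smaller.
# Everything here is hand-coded for one sentence pair; all names are hypothetical.

def resolve_it(content: str, container: str, adjective: str) -> str:
    """Pick the referent of 'it' in
    '<content> wouldn't fit into <container> because it was too <adjective>'."""
    if adjective == "big":
        # Not fitting because something is too big: that must be the content.
        return content
    if adjective == "small":
        # Not fitting because something is too small: that must be the container.
        return container
    raise ValueError(f"the schema has nothing to say about {adjective!r}")


print(resolve_it("trophy", "suitcase", "big"))
print(resolve_it("trophy", "suitcase", "small"))
```

The rule itself is trivial once the roles are assigned. What the sketch leaves out is everything a person does without noticing: knowing that a trophy is the sort of thing that goes into a suitcase, and that ‘fit into’ calls up the containment schema in the first place.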

You can even do it with a single sentence as in Jane is standing behind Clare so you cannot see her. It is clear that her refers to Jane and not Clare but only because we can project a schema of 2 similar-sized objects positioned relative to the observer’s line of sight.

So we also know that only sentence 1 below makes sense because of the schema we have for things of unequal size being positioned relative to each other and their impact on our ability to see them.

  1. The statue is in front of the cathedral.
  2. The cathedral is in front of the statue.

However, unlike with the trophy and suitcase, it is possible to imagine contexts in which sentence 2 would be acceptable. For instance, in a board game where all objects are printed on blocks of the same size and positioned on a 2D space.

This is to illustrate that the schemas are not static but interact with the rich conceptualisations we create in context.

Force dynamics

This is a notion pioneered by Leonard Talmy that explains many aspects of cognitive and linguistic processes through dynamic schemas of proportional interaction. Thus we know that all things being equal, bigger things will influence smaller things, faster things will overtake slower things, etc.

So we can immediately interpret the it in sentences such as:

  1. The foot hit the ball and it flew off.
  2. The bird landed on the perch and it fell apart.

But we also apply these to more abstract domains. We can thus easily interpret the situations behind these 2 sentences:

  1. The mother walked in and the baby calmed down.
  2. The baby walked in and the mother calmed down.

If asked to tell the story that led to 1 or 2, people would converge on very similar scenarios around each sentence.

Knowledge of the world

Sometimes, we marshal quite rich (encyclopedic) knowledge of the world to understand what we hear or see. Imagine what is required to match the following 2 pairs of sentences (drawing on Langacker):

  1. The Normans conquered England with …
  2. The Smiths conquered England with …
    a. … their moody music.
    b. … their superior army.

Obviously the right pairings are 1b and 2a. But none of this is contained in the surface form. We must have the ‘encyclopedic’ knowledge of who The Normans and The Smiths were but also the force dynamic schemas of who can conquer who.

So on hearing the sentence ‘Mr and Mrs Smith conquered Britain’, we would be looking for some metaphorical mapping to explain the mismatch between the force we know conquering requires and the force we know a married couple can exert. With sufficiently rich knowledge, this is immediately obvious as in ‘John and Yoko conquered America.’

How does machine learning do on interpreting human mental representations?

For AI, examples such as the above are a difficult challenge. It was recently proposed that a much more effective and objective Turing test would be to ask an AI to interpret sentences such as these under the Winograd Schema Challenge (https://en.wikipedia.org/wiki/Winograd_Schema_Challenge).

This is a database of pairs of sentences such as:

  1. The city councilmen refused the demonstrators a permit because they feared violence.
  2. The city councilmen refused the demonstrators a permit because they advocated violence.

This has the great advantage of perfect objectivity. Unlike with the Turing test, it is always clear which answer is correct.

The best machine learning algorithms use various tricks but they still only do slightly better than chance (57%) at interpreting these schemas.

The only problem is that it is quite hard to construct these pairs in a way that could not be solved with simple statistical distributions. For instance, the Smiths and Normans example above could be easily resolved with current techniques simply by searching which words occur most frequently together.
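
A minimal sketch of what ‘searching which words occur most frequently together’ could look like is below. The four-sentence corpus is invented purely for illustration, and the function names are hypothetical; a real system would use a far larger corpus and a better association measure than raw counts.

```python
from collections import Counter
from itertools import combinations

# A tiny made-up corpus, invented purely for illustration.
corpus = [
    "the normans brought a superior army to hastings",
    "the norman army defeated harold",
    "the smiths released moody music in the eighties",
    "fans loved the moody music of the smiths",
]


def cooccurrence_counts(sentences):
    """Count how often each unordered pair of words shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        words = set(sentence.split())
        for pair in combinations(sorted(words), 2):
            counts[pair] += 1
    return counts


def best_ending(subject, endings, counts):
    """Pick the ending whose words co-occur most often with the subject."""
    def score(ending):
        return sum(counts[tuple(sorted((subject, w)))] for w in ending.split())
    return max(endings, key=score)


counts = cooccurrence_counts(corpus)
endings = ["moody music", "superior army"]
print(best_ending("normans", endings, counts))
print(best_ending("smiths", endings, counts))
```

Nothing in this sketch knows who the Normans or The Smiths were, or what it takes to conquer anything. It only counts which words have kept each other company, which is exactly why sentence pairs that can be solved this way make a poor test.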

Also, it is not clear how the schematic and force dynamic aspects interact with the encyclopedic aspects. Can you have one without the other? Can we classify the Winograd schema sentences into different types, some of which would be more susceptible to ML approaches?

Do mental representations exist?

There is a school of thought that claims that mental representations do not actually exist. There is nothing like what I described above in the brain. It is actually just a result of perceptual task orientation. This is the ecological approach developed in the study of perception and physical manipulation (such as throwing or catching a ball).

I am always very sceptical of any approach that requires we find some bits of information resembling what we see stored in the brain. Which is why I am quite sympathetic to the notion that there are no actual mental representations directly encoded into the synaptic activations of our brain.

But even if all of these were just surface representations of completely different neural processes, it is undeniable that something like mental representations is necessary to explain how we think and speak at some level. At the very least to articulate the problems that have to be solved by machine learning.

Note: I have completely ignored the problem of embodiment which would make things even more complicated. Our bodily experience of the world is definitely involved. But the extent to which our bodies are actually a part of the reasoning process (as opposed to the brain as an independent computational control module) is a subject of hot debate.

How does machine learning represent the problem space?

Now, ML experts are not completely wrong to speak about representations. Neural nets certainly build some sort of representation of the problem space (note, I don’t call it world). We have 4 sources of evidence:

  1. Structure of data inputs: Everything is a vector encoded as a string of numbers.
  2. Patterns of activation in the neural nets (weights): This is where the ‘curve fitting’ happens.
  3. Performance on real world tasks: More reliable than humans on dog breed recognition but penguins can also be identified as pandas.
  4. Adversarial attacks: Adding seemingly random and imperceptible noise to an image or sound can make the neural net produce radically different outputs.

If we take together the vector inputs and the weights on the nodes in the neural net, we have one level of representation. But that is perhaps the less interesting level and, as complexity increases, it becomes impossible to truly figure out much about it.

But is it possible that all of that actually creates some intermediate layer that has the same representational properties as mental representations? I would argue that at this stage, it is all inputs and weights and all the representational aspects are provided by the human interpreting the outputs. But if we only had the outputs, we could still posit some representational aspects. But the adversarial attacks reveal that the representational level is missing.
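
To show why adversarial attacks are so telling, here is a minimal sketch using a hand-built linear ‘classifier’ rather than a real neural net. All the numbers, the labels and the function names are assumptions made up for illustration. The perturbation nudges each input value slightly against the sign of its weight, which is the basic idea behind gradient-sign attacks, but this is not an implementation of any particular published attack.

```python
# Toy linear "image classifier": all numbers are invented for illustration.
N = 1000  # number of "pixels"
weights = [1.0 if i % 2 == 0 else -1.0 for i in range(N)]
image = [0.5 + 0.01 * w for w in weights]  # pixel values on a 0-1 scale


def score(pixels):
    """Weighted sum of the inputs: the only 'representation' the model has."""
    return sum(w * p for w, p in zip(weights, pixels))


def classify(pixels):
    return "panda" if score(pixels) > 0 else "penguin"


def sign(v):
    return (v > 0) - (v < 0)


# Nudge every pixel by a tiny amount against the sign of its weight.
epsilon = 0.02
adversarial = [p - epsilon * sign(w) for w, p in zip(weights, image)]

print(classify(image), classify(adversarial))
print(max(abs(a - b) for a, b in zip(image, adversarial)))
```

No single value moves by more than about 0.02 on a 0 to 1 scale, yet the label flips because a thousand tiny nudges all push the weighted sum the same way. There is nothing in between the inputs and the weights that could notice that the picture has not really changed.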

Note: Humans can also be subject to adversarial attacks with all sorts of perceptual and cognitive illusions. They seem to be on a different representational level to me but they would be worth exploring further in this context.

Update: A commenter on Reddit suggested that I look at this post on feature visualisation and I think that mostly supports my point. It looks like there are lots of representations shown in that article, but they are really just visualisations of what inputs lead to certain neuron activations on specific layers of the neural net. Those are not ‘representations’ the neural net has independent access to. I think in the same way, we would not think of Pavlov’s dogs salivating at the sound of the bell as having a ‘mental representation’ of the ‘bell means food’ causal connection. Perhaps we could rephrase the question as whether training a neural net is similar to classical or operant conditioning, and what that means with respect to the question of representation.

Can we create mental representations in machines?

Judea Pearl thinks that nothing current ML is doing is going to lead to a ‘model of the world’ or as I call it ‘mental representations’. But I’m skeptical that his solution is a path to mental representations either:

“The first step, one that will take place in maybe 10 years, is that conceptual models of reality will be programmed by humans.”

This is what the early AI expert systems tried to do but it proved very elusive. One example of manually coding mental representations is FrameNet, a database of words linked to semantic frames but it barely scratches the surface. For instance, here’s the frame for container which links to suitcase. But that still doesn’t help with the idea of trophy being sometimes small enough to fit and sometimes too big. I can see how FrameNet could be used on very small subsets of problems but I don’t see a way for scaling it up in a way that could take into account everything involved in the examples I mentioned. We are faced with the curse of dimensionality here. The possible combinations just grow too fast for us to compute them.
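
A rough back-of-the-envelope sketch of how fast the combinations grow: the figure of 1,000 hand-coded frames below is a made-up round number for illustration, not the actual size of FrameNet, and the function name is hypothetical.

```python
import math

# Hypothetical inventory size: a round number chosen for illustration only.
N_FRAMES = 1000


def frame_combinations(n_frames: int, k: int) -> int:
    """Number of ways to pick k distinct frames that interact in one sentence."""
    return math.comb(n_frames, k)


for k in range(1, 6):
    print(k, frame_combinations(N_FRAMES, k))
```

Even this undercounts the problem, since it ignores the order in which frames combine, the roles each participant plays, and the encyclopedic knowledge attached to every word.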

I’m also not sure that simply running more data through bigger and bigger RNNs or CNNs will get us there either. But I can’t rule out that brute force will get us close enough for it not to matter that mental representations are not involved.

Perhaps, if we label enough text of some subdomain with FrameNet schemas, we could train a neural net on this. But that will only help with the examples where rich knowledge of the world is not required. We can combine a schema of a suitcase and a trophy with that of ‘fit’ and match ‘it’ with the more likely antecedent. Would that approach help with the demonstrators and councilmen? But even if so, the Winograd Schema Challenge is only an artificially constructed set of sentence pairs designed for a particular purpose. The mental representations involved crop up everywhere all the time. So we not only need a way of invoking mental representations but also a way to decide if they are needed and, if so, which ones.

Machine learning fast and slow up the garden path

Let’s imagine that we can somehow engineer a solution that can beat the Winograd Schema Challenge. Would that mean that it has created mental representations? We may want to reach for Searle’s ‘Chinese Room Argument’ and the various responses to it. But I don’t think we need to go that deep.

One big aspect of human intelligence that is often lumped together with the rest is metacognition. This is the ability to bring the process of thinking (or speaking) to conscious awareness and control it (at least to a degree). This is reminiscent of Kahneman’s two systems in ‘Thinking Fast and Slow’.

Machine learning produces almost exclusively ‘fast thinking’ – instantaneous matching of inputs to outputs. It is the great advance over previous expert system models of AI which tried to reproduce slow thinking.

Take for instance the famous Garden path sentences. Compare these 2:

  1. The horse raced past the barn quickly.
  2. The horse raced past the barn fell.

Imagine the mental effort required to pause and retrace your steps when you reach the word ‘fell’ in the second sentence. It is a combination of instantaneous production of mental images that crash and slow deliberate parsing of the sentence to construct a new image that is consistent with our knowledge of the world and the syntactic schema used to generate it.

Up until the advent of stochastic approaches to machine learning in the 1990s (and neural nets in the 2010s), most AI systems tried to reproduce the slow thinking through expert systems encoded as decision trees. But they mostly failed because the slow thinking only works because of the fast thinking which provides the inputs to it. Now neural nets can match complex patterns that we once thought impossible. But they do it very differently from us. There doesn’t seem to be much thinking about how to go about developing the sort of metacognition that is required to combine the two. All of the conditional decision-making around what to do with the outputs of ML algorithms has to be hardcoded. Alexa can recognize my saying ‘turn on bedroom light’ but I had to give it a name and if I want to make it part of a more complex process (make sure bedroom light is off when I leave home), I have to go to IFTTT.

I don’t see how Pearl’s approach will take us there. But I don’t see an alternative, either. Perhaps, the mental representations will emerge epiphenomenally as the neural nets grow and receive more sophisticated inputs about the spatial nature of the world (rather than converting everything to vectors). Maybe they will be able to generate their own schemas as training inputs. I doubt it, but wouldn’t want to bet against it.

What is just as likely is that we will reach a plateau (maybe even resulting in a new AI winter) that will only see incremental improvements and won’t take the next step until a completely new paradigm emerges (which may not happen for decades if ever).

Conclusion

It is not always obvious that more in-depth knowledge of a domain contributes to a better model of it. We are just as likely to overfit our models as to improve them when we dive too deep. But I think that mental representations at least reveal an important problem domain which should be somehow reflected in what machines are being taught to learn.

Update

In response to a comment on Reddit, I wanted to add the following qualification.

I think I ended up sounding a bit more certain than I feel. I know I’m being speculative but I note that all the critics are pointing at hypotheticals and picking at my definition of mental representation (which is not necessarily unwarranted).

But what I would like to hear is a description of the next 5 specific problems to be solved to get nearer to, say, 75% on the Winograd Schema Challenge that can then be built on further (i.e. not just hacking around collocation patterns Watson style).

I also wanted to note that I omitted a whole section on the importance of collocability in language with a reference to Michael Hoey’s work on Lexical Priming, which I think is one of the 2 most important contributions to the study of language in the last 20 years, the other being William Croft’s Radical Construction Grammar. Reading these would benefit many ML researchers, along with Fauconnier and Turner’s The Way We Think.

Not ships in the night: Metaphor and simile as process

In some circles (rhetoric and analytic philosophy come to mind), much is made of the difference between metaphor and simile.

(Rhetoricians pay attention to it because they like taxonomies of communicative devices and analytic philosophers spend time on it because of their commitment to a truth-theoretical account of meaning and naive assumptions about compositionality).

It is true that their surface and communicative differences have an impact in certain contexts but if we’re interested in the conceptual underpinnings of metaphor, we’re more likely to ignore the distinction altogether.

But what’s even more interesting is to think about metaphor and simile as just part of the process of interpersonal meaning construction. Consider this quote from a blog on macroeconomics:

[1a] Think of [1b] the company as a ship. [2] The captain has steered the ship too close to the rocks, and seeing the impending disaster has flown off in the ship’s helicopter and with all the cash he could find. After the boat hit the rocks no lives were lost, but many of the passengers had a terrifying ordeal in the water and many lost possessions, and the crew lost their jobs. [3] Now if this had happened to a real ship you would expect the captain to be in jail stripped of any ill gotten gains. [4] But because this ship is a corporation its captains are free and keep all their salary and bonuses. [5] The Board and auditors which should have done something to correct the ship’s disastrous course also suffer no loss.

Now, this is really a single conceptual creation but it happens in about 5 moves which I highlighted above. (Note: I picked these 5 as an illustrative heuristic but this is not to assume some fixed sequence).

[1] The first move establishes an idea of similarity through a simile. But it is not in the traditional form of ‘X is like Y’. Rather, it starts with the performative ‘Think of’ [1a] and then uses the simile ‘as’ [1b]. ‘Think of X as Y’ is a common construction but it is rarely seen as an example in discussions of similes.

[2] This section lays out an understanding of the source domain for the metaphorical projection. It also sets the limit on the projection in that it is talking about ‘company as a ship traveling through water’ in this scenario, not a ship as a metonym for its internal structure (for instance, the similarities in the organisational structure of ships and companies.) This is another very common aspect of metaphor discourse that is mostly ignored. It is commonly deployed as an instrument in the process of what I like to call ‘frame negotiation’. On the surface, this part seems like a narrative with mostly propositional content that could easily stand alone. But…

[3] By saying, ‘if this happened to a real ship’ the author immediately puts the preceding segment into question as an innocent proposition and reveals that it was serving a metaphorical purpose all along. Not that any of the readers were really lulled into a false sense of security, nor that the author was intending some dramatic reveal. But it is an interesting illustration of how the process of constructing analogies contains many parts.

[4] This part looks like a straightforward metaphor: ‘the ship is a corporation’ but it is flipped around (one would expect ‘the corporation is a ship’). This move links [2] and [3] and reminds us of [1].

[5] This last bit seems to refer to both domains at once: ‘The board and the auditors’ refers to the business case and ‘ship’s course’ to the narrative in the simile. But we could even more profitably think of it as referring to this new blended domain, a hypothetical model in which both the shipping and business characteristics are integrated.

But the story does not end there, even though people who are interested in metaphors often feel that they’ve done enough at this stage (if they ever reach it). My recommended heuristic for metaphor analysts is to always look at what comes next. This is the start of the following paragraph:

To say this reflects everything that is wrong with neoliberalism is I think too imprecise. [1] I also think focusing on the fact that Carillion was a company built around public sector contracts misses the point. (I discussed this aspect in an earlier post.)

If you study metaphor in context, this will not surprise you. The blend is projected into another domain that is in a complex relationship to what precedes and what follows. This is far too conceptually intricate to take apart here but it is of course completely communicatively transparent to the reader and would have required little constructive effort on the part of the author (who is most likely to have spent time on constructing the simile/metaphor and its mappings but little on their embedding into the syntactic and textual weave that give it its intricacy).

In the context of the whole text, this is a local metaphor that plays as much an affective as it does a cognitive role. It opens up some conceptual spaces but does not structure the whole argument.

The metaphor comes up again later and in this case it also plays the role of an anaphor by linking 2 sections of the text:

Few people would think that never being able to captain a ship again was a sufficient disincentive for the imaginary captain who steered his boat too close to the rocks.

Also of note is the use of the word ‘imaginary’ which puts that statement somewhere between a metaphor (similarity expressed as identity) and simile (similarity expressed as comparison).

There are two lessons here:

  1. The distinction between metaphor and simile could be useful in certain contexts but in practice, their uses blend together and it is not always easy to establish boundaries between them. But even if we could, the underlying cognition is the same (even if truth-conditionally they may differ on the surface). We could even complicate things further and introduce terms such as analogy, allegory, or even parable in this context but it is hard to see how much they would help us elucidate what is going on.

  2. Neither metaphor nor simile is a static component of a larger whole (like bricks in a wall or words in a dictionary). They are surface aspects of a rich and dynamic process of meaning making. And the meaning is ‘literally’ (but not really literally) being made here right in front of our eyes or rather by our eyes. What metaphor and simile (or the sort of hybrid metasimile present here) do is help structure the conceptual spaces (frames) being created but they are not doing it alone. There are also narratives, schemas, propositions, definitions, etc. All of these help fill out the pool of meaning into which we may slowly immerse ourselves or hurtle headlong. This is not easy to see if we only look at metaphor and simile outside their natural habitat of real discourse. Let that be a lesson to us.

How to read ‘Women, Fire and Dangerous Things’: Guide to essential reading on human cognition

Note:

These are rough notes for a metaphor reading group, not a continuous narrative. Any comments, corrections or elaborations are welcome.

Why should you read WFDT?

Women, Fire, and Dangerous Things: What Categories Reveal About the Mind is still a significantly underappreciated and (despite its high citation count) not-enough-read book that has a lot to contribute to thinking about how the mind works.

I think it provides one of the most concise and explicit models for how to think about the mind and language from a cognitive perspective. I also find its argument against the still prevalent approach to language and the mind as essentially fixed objects very compelling.

The thing that has been particularly underused in subsequent scholarship is the concept of ‘ICMs’ or ‘Idealised Cognitive Models’ which both puts metaphor (for work on which Lakoff is most well known) in its rightful context and outlines what we should look for when we think about things like frames, models, scripts, scenarios, etc. Using this concept would have avoided many undue simplifications in work in the social sciences and humanities.

Why this guide

Unfortunately, the concision and explicitness I extolled above are surrounded by hundreds of pages of arguments and elaborations that are often less well thought out than the central thesis and have been a vector for criticism (I’ve responded to some of these in my review of Verena Haser’s book).

As somebody who translated the whole book into Czech and penned extensive commentary on its relevance to the structuralist linguistic tradition, I have perhaps spent more time with it than most people other than the author and his editors.

Which is why when people ask me whether to read it, I usually recommend an abbreviated tour of the core argument with some selections depending on the individual’s interest.

Here are some of my suggestions.

Chapters everyone should read

Chapters 3, 4, 5, 6 – Core contribution of the book – Fundamental structuring principles of human cognition

These four chapters summarize what I think everybody who thinks about language, mind and society should know about how categories work. Even if it is not necessarily the last word on every (or any) aspect, it should be the starting point for inquiry.

All the key concepts (see below) are outlined here.

Preface and Chapter 1 – Outline of the whole argument and its implications

These brief chapters lay out succinctly and, I think, very clearly the overall argument of the book and its implications. This is where he outlines the core of the critique of objectivism which I think is very important (if itself open to criticism).

Chapter 2: Precursors

This is where he outlines the broader panoply of thinkers and research outcomes in recent intellectual history whose insights this book tries to systematise and take further.

The chapter takes up some of the key thinkers who have been critical of the established paradigm. Read it not necessarily for understanding them but for a way of thinking about their work in the context of this book.

Case studies

The case studies represent a large chunk of the book and few people will read all 3. But I think at least one of them should be part of any reading of the book. Most people will be drawn to number 1 on metaphor but I find that number 2 shows off the key concepts in most depth. It will require some focus and patience from non-linguists but I think it is worth the effort.

Case study 3 is perhaps too linguistic (even though it introduces the important concept of constructions) for most non-linguists.

Key concepts

No matter how the book is read, these are the key concepts I think people should walk away with understanding.

Idealized Cognitive Models (also called Frames in Lakoff’s later work)

I don’t know of any more systematic treatment of how our conceptual system is structured than this. It is not necessarily the last word but should not be overlooked.

Radial Categories

When people talk about family resemblances they ignore the complexity of the conceptual work that goes into them. Radial categories give a good sense of that depth.

Schemas and rich images

While image schemas are still a bit controversial as actual cognitive constructs, Lakoff’s treatment of them alongside rich images shows the importance of both as heuristics to interpreting cognitive phenomena.

Objectivism vs Basic Realism

Although objectivism (nothing to do with Ayn Rand) is not a position taken by any practicing philosophers and feels a bit straw-manny, I find Lakoff’s outline of it eerily familiar as I read works across the humanities and social sciences, let alone philosophy. When people read the description, they should avoid dismissing it with ‘of course nobody thinks that’ and reflect on how many people approach problems of mind and language as if they did think that.

Prototype effects and basic-level categories

These concepts are not original to Lakoff but are essential to understanding the others.

Role of metaphor and metonymy

Lakoff is best known for his earlier work on metaphor (which is why figurative language is not a key concept in itself) but this book puts metaphor and metonymy in perspective of the broader cognition.

Embodiment and motivation

Embodiment is an idea thrown around a lot these days. Lakoff’s is an important early contribution that shows some of the actual interaction between embodiment and cognition.

I find it particularly relevant when he talks about how concepts are motivated but not determined by embodied cognition.

Constructions

Lakoff’s work was taking shape alongside Fillmore’s work on construction grammar and Langacker’s on cognitive grammar. While the current construction grammar paradigm is much more influenced by those, I think it is still worth reading Lakoff for his contribution here. Particularly case studies 2 and 3 are great examples of the power of this approach.

Additional chapters of interest

Elaborations of core concepts

Chapters 17 and 18 elaborate on the core concepts in important ways but many people never reach them because they follow a lot of work on philosophical implications.

Chapter 17 on Cognitive Semantics takes another, deeper look at ICMs (idealized cognitive models) across various dimensions.

Chapter 18 deals with the question of how conceptual categories work across languages in the context of relativism. The name of the book is derived from a non-English example but this takes the question of universals and language specificity head on. Perhaps not in the most comprehensive way (the debate on relativism has moved on) but it illuminates the core concepts further.

Case studies

Case Studies 2 and 3 should be of great interest to linguists. Not because they are perfect but because they show the depth of analysis required of even relatively simple concepts.

Philosophical implications

Lakoff is not shy about placing his work in the context of disruption of the reigning philosophical paradigm of his (and to a significant extent our) day. Chapter 11 goes into more depth on how he understands the ‘objectivist paradigm’. It has been criticised for not representing actual philosophical positions (which he explicitly says he’s not doing) but I think it’s representative of many actual philosophical and other treatments of language and cognition.

This is then elaborated in chapters 12 – 16 and of course in his subsequent book with Mark Johnson Philosophy in the Flesh. I find the positive argument they’re making compelling but it is let down by staying on the surface of the issues they’re criticising.

What to skip

Where Lakoff (and elsewhere Lakoff and Johnson) most open themselves to criticism is their relatively shallow reading of their opponents. Most philosophers don’t engage with this work because they don’t find it speaks their language and when it does, it is easily dismissed as too light.

While I think that the broad critique this book presents of what it calls ‘objectivist approaches’ is correct, I don’t recommend that anyone takes the details too seriously. Lakoff simultaneously gives it too little and too much attention. He argues against very small details but leaves too many gaps.

This means that those who should be engaging with the very core of the work’s contribution fixate on errors and gaps in his criticism and feel free to dismiss the key aspects of what he has to say (much to their detriment).

For example, his critique of situational semantics leaves too many gaps and left him open to successful rejoinders even if he was probably right.

What is missing

While Lakoff engages with cognitive anthropology (and he and Johnson acknowledge their debts in the preface to Metaphors We Live By), he does not reflect the really interesting work in this area. Goffman (shockingly) gets no mention, nor does Victor Turner whose work on liminality is a pretty important companion.

There’s also little acknowledgement of work on texts such as that by Halliday and Hasan (although that was arguably still waiting for its greatest impact in the mid 1980s with the appearance of corpora). But Lakoff and most of the researchers in this area stay firmly at the level of a clause. But given that my own work is mostly focusing on discourse and text-level phenomena, I would say that.

What to read next

Here are some suggestions for where to go next for elaborations of the key concepts or ideas with relevance to those outlined in the book.

  • Moral Politics by Lakoff launched his forays into political work but I think it’s more important as an example of this way of thinking applied for a real purpose. He replaces Idealized Cognitive Models with Frames but shows many great examples of them at work. Even if it falls short as an exhaustive analysis of the issues, it is very important as a methodological contribution of how frames work in real life. I think of it almost as a fourth case study to this book.
  • The Way We Think by Gilles Fauconnier and Mark Turner provides a model of how cognitive models work ‘online’ during the process of speaking. Although, it has made a more direct impact in the field of construction grammar, its importance is still underappreciated outside it. I think of it as an essential companion to the core contribution of this book. Lakoff himself draws on Fauconnier’s earlier work on mental spaces in this book.
  • Work on construction grammar: This book was one of the first places where the notion of ‘construction’ in the sense of ‘construction grammar’ was introduced. It has since developed into its own substantive field of study that has been driven by others. I’d say the work of Adele Goldberg is still the best introduction but for my money William Croft’s ‘Radical Construction Grammar’ is the most important. Taylor’s overview of the related ‘Cognitive Grammar’ is also not a bad next read.
  • Work on cognitive semantics: There is much to read here. Talmy’s massive 2 volumes of ‘Cognitive Semantics’ are perhaps the most comprehensive but most of the work here happens across various journals. I’m not aware of a single shorter introduction.
  • Philosophy and the Mirror of Nature by Richard Rorty is a book I frankly wish Lakoff had read. Rorty’s taking apart of philosophy’s epistemological imaginings is very much complementary to Lakoff’s critique of ‘objectivism’ but done while engaging deeply with the philosophical issues. While I basically go along with Lakoff’s and later Lakoff and Johnson’s core argument, I can see why it could be more easily dismissed than Rorty. Of course, Rorty’s work is also better known for its reputation than deeply reflected in much of today’s philosophy. Lakoff and Johnson’s essential misunderstanding of Rorty’s contribution and fundamental compatibility with their project in Philosophy in the Flesh is an example of why so many don’t take that aspect of this work seriously. (Although, they are right that both Rorty and Davidson would have been better served by a less impoverished view of meaning and language.)

10 ways in which music is like language and 8 (more important) ways in which it is not

People often talk about music as if it were language. Leonard Bernstein even recorded a series of lectures applying Chomsky’s theory of generative grammar to music. Chomsky himself answered a question on this in a not very satisfying manner. Some people can get very exercised over this.

But it seems to me that playing around with the strengths and weaknesses of the music = language metaphor can help us come to grips with the question a bit better. We can find a number of mappings between music and language but an equal number of mis-mappings. We do not need to go very far into it to see where they are. That does not mean that a deeper investigation into musical properties of language and linguistic properties of music cannot be fruitful. And obviously they are both universal human faculties, but looking for a musical essence in language or a linguistic essence in music is what a metaphor-aware approach to this question is hoping to warn against.

Here are some of the obvious similarities and dissimilarities. (Note: After I finished my list, I came across a Chomskyan comparison by Jackendoff which has a slightly different focus but comes to the same general conclusion.)

Music is like (a) language in that:

  1. It can be described through a system of rules that operate on a limited vocabulary. There are 12 notes (on the Western chromatic scale) that can produce an infinite variety of melodies just purely in their combination further enhanced by their combination with rhythms, tempos and harmonies. (Although I have argued elsewhere that language is not actually much like this, at all.)
  2. It combines small building blocks into larger components that are like words, phrases, sentences and text. In fact, we talk about phrasing in music. But we have things like bars, stanzas, movements, etc.
  3. It is recursively expressive. I can embed little segments of music in others indefinitely. Bach’s variations are an example of this as is jazz improvisation.
  4. It has dual articulation in that smaller segments like scales are organized independently of large segments. This is well-known about language (in certain circles). We articulate sounds into words and words into statements at the same time but also seemingly independently of each other — we know this because we can be good at one and bad at the other (thus dual articulation). In music, producing individual notes (e.g. fingering on piano or guitar, or breathing and embouchure on a trumpet or trombone) is a skill independent of expressing the musical ideas contained in the notes.
  5. It has phraseology and idioms: We speak of musical phrasing but that is more a question of production. But music also has set ways of expressing certain things. There are things like chord progressions or minor or major modes that combine together to express musical meaning. They are more than the sum of their parts and form their own building blocks.
  6. It can cross-reference between compositions (texts): We can hear echoes of folk songs in classical music or we see direct quotations of melodies in jazz. (But it should be noted that this is much less pervasive than in language where co-reference is one of the core components of language.)
  7. It can communicate emotion both segmentally (sequences of notes) and suprasegmentally (expression, emphasis, etc.). In the same way, any phrase (such as Please, sit down.) can be pronounced in many different ways (e.g. in a welcoming, quizzical or threatening manner; deliberately, offhand, formally or casually, etc.).
  8. It has styles, genres and dialects or even accents. We can instantly recognise music recorded in different time periods or in different styles. Even individual artists have particular ways of expressing themselves musically that can be imitated or parodied. YouTube is full of videos of people playing X in the style of Y.
  9. It can be acquired and learned. To reach a ‘fluency’ in music requires an effort that is not dissimilar to the acquisition or learning of language. Part of it happens naturally, simply through exposure to music and part of it is formal — such as learning words to name parts of music, such as notes, chords, harmony, etc. or learning to ‘read sheet music’.
  10. It is culturally conditioned. Different cultures have developed very different takes on what music sounds like. Chinese music does not sound very musical to Western ears due to the very different approach to tonality.

This list makes it seem like music and language are very similar. But the next list of dissimilarities shows that they are also different in fundamental ways.

Music is NOT like (a) language in that:

  1. It cannot be used to directly communicate propositional meaning. I can say, ‘my house is right at the end of the street’ or ‘that will be 50 cents’. But there’s no way to express this kind of content in music. Sometimes music tries to imitate language (Janáček is often cited doing that) but without the words, nobody would know what an opera is about.
  2. It has a radically smaller set of building blocks and rules for their combination than language. There are only 12 intervals/tones in Western Music. But this in itself would not be a problem. There are languages with similarly low numbers of phonemes (distinct sounds). However, there is no intermediary unit of expression equivalent to the word. The rules of melodic composition operate directly on these 12 (or more if we include microtones) tones.
  3. It does not have internal instruments of disambiguation. Being able to repair a conversation that is broken with phrases like ‘what did you say’ or ‘can you say this again’ is a fundamental part of the communicative process. Without them, language would not be nearly as useful. There’s nothing like that to be found within the ‘communicative’ inventory of music that does not rely on verbal or written language in one way or another.
  4. It can only be universally acquired in the most rudimentary sense (i.e. everybody can hum a tune but very few people can play an instrument). Everybody can and does learn their first language. And everybody acquires some musicality as part of their socialisation into their culture. But most of what we would consider musical fluency is learned through some means of instruction. Music (in this strict sense) is more like written than spoken language.
  5. There is a much greater difference between receptive and productive competence. Everybody knows more of their language receptively (passively) than productively. Depending on context, this gap is very small or very large. A person speaking a language of a small community without a lot of specialisation will have a smaller gap between what they can say and what they can understand (although there are many specialised languages even in these contexts). But this difference will be greater when it comes to technical language between a first-year university student and their professor. But in music, everybody can listen passively and receive most of the intended effect while their ability to produce the same music will be severely limited. Many people can hum back a simple tune but they cannot reproduce a full musical performance. Even professional musicians will vary in their ability here.
  6. There is much greater variability in individuals’ ability to produce music beyond the most trivial. It requires effort and study to produce music in a way that we think of as music. We can say that everyone is musical to some degree but most people cannot actually produce anything beyond the simplest of tunes. However, in language, everybody can communicate (even people with impairments) to a significant degree. The differences in competence only appear at the higher levels.
  7. Much more of the production process requires cooperation among individuals. While not a requirement, most music we consume is produced by groups of people. Or if not another person, it requires an instrument made by another person. Language production (at least in spoken form) is primarily by individuals. (Although, there are instances of group production — theatre, speeches, etc.)
  8. It is much more limited in its dialogic potential (i.e. it is most often used for one-way communication between few producers and more recipients or joint co-production of producer/recipients). Language is fundamentally dialogic. Anything that is said can be responded to. Questions can be answered, propositions can be countered or elaborated. Music, on the other hand, is primarily declarative. Of course, metaphorically, we can talk about dialogic elements in music. In jazz or blues, we have call and response, in classical music we have things like counterpoint. We can also talk about members of an orchestra communicating and responding to each other’s musical ideas. But there is no such thing in music as saying: ‘No, I disagree with you about X, instead, I believe Y and this is why.’

Ultimately, this is not all that important. We know what language is and we know what music is. Saying one is or is not like the other in some way won’t change any of that. However, it can help us think more clearly about them and avoid ignoring important aspects and unique properties of both.

Note: What is music

You often hear popularising musicians saying things like ‘Everybody is musical’ or ‘everything is music’. But that is not what most people’s intuitions about music tell them. For most people, music is the sort of thing they hear on the radio. It is the result of composition and production. There are instruments involved and skills and ability to play those instruments. That is the sense in which I’m comparing language to music. When I sing in the shower, I may be engaging in a musical activity but it is not the prototypical meaning of the word. And if we try to build our case for music around that, we’d be leaving other important aspects out.

Language of X

Lists similar to this one could be constructed for other cases where people talk about the ‘language of X’: programming, gestures, art, or architecture. These would probably end up with lists that overlap with this one in many ways.

There are different cliches in use about many of these domains. So people tend to overestimate the extent to which facial expressions or architecture or art are like language but underestimate the degree to which programming languages are like natural language.

So in some contexts, I’d want to stress the dissimilarities. When people say things like ‘He can express all he needs through dance.’ or ‘90% of all language is nonverbal’, we would want to point to the propositional and dialogic aspects of language that are lacking in these domains.

But in other contexts, we’d want to point to the parallels. For instance, we may want to remind ourselves that programming languages are like natural language in many (but not all) of the ways music is like language. They have dialects, phrases and idioms, multiple levels of articulation, recursiveness (duh!), etc.

Note:

This post started life as an answer on Stack Exchange which I then cleaned up for Tumblr. This is an expanded version. It is also posted on Medium. This version has been slightly amended for spelling and punctuation.

What language looks like: Dictionary and grammar are to language what standing on one foot is to running

Background

Sometimes a rather obscure and complex analogy just clicks into place in one’s mind and allows a slightly altered way of thinking that just makes so much sense, it hurts. Like putting glasses on in the morning and the world suddenly snapping into shape.

This happened to me this morning when reading the Notes from Two Scientific Psychologists blog and the post on Do people really not know what running looks like?

It describes the fact that many famous painters (and authors of instructional materials on drawing) did not depict running people correctly. When running, it is natural (and essential) to put forward the arm opposite the leg that’s going forward. But many painters who depict running (including the artist who created the poster for the 1922 Olympics!) do it the wrong way round. Not just the wrong way, the way that is almost impossible to perform. And this has apparently been going on for as long as depiction has been a thing. But it’s not just artists (who could even argue that they have other concerns). What’s more, when you ask a modern human being to imitate somebody running in a stationary pose (as somebody did on the website Phoons) they will almost invariably do it the wrong way round. Why? There are really two separate questions here.

  1. Why don’t the incorrect depictions of running strike most people as odd?
  2. Why don’t we naturally arrange our bodies into the correct stance when asked to imitate running while standing still?

Andrew Wilson (one of the two psychologists) has the perfect answer to question 2:

Asking people to pose as if running is actually asking them to stand on one leg in place, and from their point of view these are two very different things with, potentially, two different solutions. [my emphasis]

And he prefaces that with a crucial point about human behavior:

people’s behaviour is shaped by the demands of the task they are actually solving, and that might not be the task you asked them to do.

Do try this at home: try to imitate a runner standing up, then slowly (mime-like), then speed it up. Settling into the wrong configuration is the natural thing to do. Doing it the ‘right’ way round is hard. It was not until I sped up into an actual run that my arms found the opposite motion natural, and by then I could no longer keep track of what was going on. I would imagine that this would be the case for most people. In fact, the few pictures I could find of runners arranged standing at the start of the race have most of them also with the ‘wrong’ hand/leg position and they’re not even standing on one leg. (See here and here.)

Which brings us back to the first question. Why doesn’t anybody notice? I personally find it really hard to even identify the wrong static depiction at a glance. I have to slow down, remember what is correct, then match it to the image. What’s going on? We obviously don’t have any cognitive control over the part of running that controls the movement of the arms in relation to the movement of the legs. We also don’t have any models or social scripts that pay attention to this sort of thing. It is a matter of conscious effort, a learned behaviour, to recognize these things.

Why is this relevant to language?

If you ask someone to describe a language, they will most likely start telling you about the words and the rules for putting them together. In other words, compiling a dictionary and a grammar. They will say something like: “In Albanian, the word for ‘bread’ is ‘bukë’”. Or they will say something like “English has 1 million words.”, “Czech has no word for training.” or “English has no cases.”

All of these statements reflect a notion of language that has a list of words that looks a little like this:

bread n. = 1. baked good, used for food, 2. metaphor for money, etc.
eat v. = 1. process of ingestion and digestion, 2. metaphor, etc.
people n. plural = human beings

And a grammar that looks a little bit like this.

Sentence = Noun (subj.) + Verb + Noun (obj.)

All of this put together will give us a sentence:

People eat food.

All you need is a long enough list of words and enough (but not as many) rules and you’ve got a language.

But as linguists have discovered through not a bit of pain, you don’t have a language. You have something that looks like a language but not something that you can actually speak as a language. It’s very similar to language but it’s not language.
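The dictionary-plus-grammar picture described above can be sketched as a toy generator. This is only an illustration, assuming a tiny made-up lexicon (the noun "food" is taken from the example sentence rather than the word list) and the single Noun + Verb + Noun rule; the names LEXICON, RULE and generate_sentences are hypothetical and not taken from any linguistic tool.

```python
from itertools import product

# Toy lexicon: hypothetical entries modelled on the word list above.
LEXICON = {
    "noun": ["people", "bread", "food"],
    "verb": ["eat"],
}

# The single rule from the text: Sentence = Noun (subj.) + Verb + Noun (obj.)
RULE = ("noun", "verb", "noun")


def generate_sentences(lexicon, rule):
    """Return every word sequence the rule licenses, formatted as a sentence."""
    slots = [lexicon[category] for category in rule]
    return [" ".join(words).capitalize() + "." for words in product(*slots)]


sentences = generate_sentences(LEXICON, RULE)
for sentence in sentences:
    print(sentence)
```

With three nouns and one verb, the rule licenses nine strings. "People eat food." is among them, but so is "Bread eat people.", and nothing in the word list or the rule tells the two apart. Adding more words and more rules makes the output longer, not more speakable, which is roughly the problem the text goes on to describe.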

Kind of like the picture of the runner with the arms going in the opposite direction. It looks very much like someone running but it’s not; it’s just a picture of them running and the picture is fundamentally wrong. Just not in a way that is at all obvious to most people most of the time.

Why grammars and dictionaries seem like a good portrait of language

So, we can ask the same two questions again.

  1. Why does the stilted representation of language as rules and words not strike most people (incl. Steven Pinker) as odd?
  2. Why don’t we give more realistic examples of language when asked to imitate one?

Let’s start with question 2 again which will also give us a hint as to how to answer question 1.

So why, when asked to give an example of English, am I more likely to give:

John loves Mary.

or

Hello. Thank you. Good bye.

than

Is it cold in here? Could you pass the sugar, please. No no no. I’ll think about it?

It’s because I’m achieving a task that is different from actually speaking the language. When asked to illustrate a language, we’re not communicating anything in the language. So our very posture towards the language changes. We start thinking in equivalencies and left and right sides of the word (word = definition) and building blocks of a sentence. Depending on who we’re speaking to, we’ll choose something very concrete or something immediately useful. We will not think of nuance, speech acts, puns or presupposition.

But the vast majority of our language actions are of the second kind. And many of the examples we give of language are actually good for only one thing: Giving an example of the language. (Such as the famous example from logic ‘A man walks’ which James MacCawley analysed as only being usable in one very remote sense.)

As a result, if we’re given the task of describing language, coming up with something looking like a dictionary and a grammar is the simplest and best way of achieving fullfilling the assignment. If we take a scholarly approach to this task over generations, we end up with something that very much looks like the modern grammars and dictionaries we all know .

The problem is that these don’t really give us “a picture of language”, they give us “a picture of a pose of language” that looks so much like language to our daily perception that we can’t tell the difference. But in fact, they are exactly the opposite of what language looks like.

Now, we’re in much more complex waters than running. Although I imagine the exact performance of running is in many ways culturally determined, the amount of variation is going to be limited by the very physical nature of the relatively simple task. Language, on the other hand, is almost all culture. So, I would expect people in different contexts to give different examples. I read somewhere (can’t track down the reference now) that Indian grammarians tended to give examples of sentences in the imperative. Early Greeks (like Plato) had a much more impoverished view of the sentence than I showed above. And I’m sure there are languages with even more limited metalanguage. However, the general point still stands. The way we tend to think about language is determined by the nature of the task.

The key point I’ve repeated over and over (following Michael Hoey) is that grammars and dictionaries are above all texts written in the language. They don’t stand apart from it. They have their own rules, conventions and inventories of expression. And they are susceptible to the politics and prejudices of their time. Even the OUP. At the same time, they can be very useful tools for developing language skills or dealing with unfamiliar texts. But so can asking a friend or figuring out the meaning in context.

Which brings us to question 1. Why has nobody noticed that language doesn’t quite work that way? The answer is that – just like with running – people have. But only when they try to match the description with something that is right in front of them. Even then, they frequently (and I’m talking about professional linguists like Steven Pinker here) ignore the discrepancy or ascribe it to a lack of refinement of the descriptions. But most of the time, the tasks that we fulfil with language do not require us to engage the sort of metacognitive apparatus that would direct us to reflect on what’s actually going on.

What does language really look like

So is there a way to have an accurate picture of language? Yes. In fact, we already have it. It’s all of it. We don’t perhaps have all the fine details, but we have enough to see what’s going on – if we look carefully. It’s not like linguists of all stripes have not described pretty much everything that goes on with language in one way or another. The problem is that they try to equate the value of a description to the value of the corresponding model that very often looks like an algorithm amenable to being implemented in a computer program. So, if I describe a phenomenon of language as a linguist, my tendency is to immediately come up with a fancy looking notation that will look like ‘science’. If I can make it ‘mathematical’, all the better. But all of these things are only models. They are ways of achieving a very particular task. Which is to – in one way or another – model language for a particular purpose. Development of AI, writing of pedagogic grammars, compiling word lists, predicting future trends, tracing historical developments, estimating psychological impact, etc. All of these are distinct from actual pure observation of what is going on. Of course, even simple description of what I observe is a task of its own with its own requirements. I have to choose what I notice and decide what I report on. It’s a model of a sort, just like an accurate painting of a runner in motion is just a model (choosing what to emphasize, shadows, background detail, facial expression, etc.) But it’s the task we’re really after: Coming up with as accurate and complete a picture of language as is possible for a collectivity of humans.

People working in construction grammars in the usage-based approach are closest to the task. But they need to talk with people who work on texts, as well, if they really want to start painting a fuller picture.

Language is signs on doors of public restrooms, dirty jokes on TV, mothers speaking to children, politicians making speeches, friends making small talk in the street, newscasters reading the headlines, books sold in bookshops, gestures, teaching ways of communication in the classroom, phone texts, theatre plays, songs, blogs, shopping lists, marketing slogans, etc.

Trying to reduce their portrait to words and rules is just like trying to describe a building by talking about bricks and mortar. They’re necessary and without them nothing would happen. But a building does not look like a collection of bricks and mortar. Nor does knowing how to put brick to brick and glue them together help in getting a house built. At best, you’d get a knee-high wall. You need a whole lot of other knowledge and other kinds of strategies for building corners and windows, but also for getting planning permission, digging a foundation, hiring help, etc. All of those are also involved in the edifices we construct with language.

An easy counterargument here would be: That’s all well and good but the job of linguistics is to study the bricks and the mortar and we’ll leave the rest to other disciplines like rhetoric or literature. At least, that’s been Chomsky’s position. But the problem is that even the words and grammar rules don’t actually look like what we think they do. For a start, they’re not arranged in any of the ways in which we’re used to seeing them. But they probably don’t even have the sorts of shapes we think of them in. How do I decide whether I say, “I’m standing in front of the Cathedral” or “The Cathedral is behind me.”? Each of these triggers a very different situation and perspective on exactly the same configuration of reality. And figuring out which is which requires a lot more than just the knowledge of how the sentence is put together. How about novel uses of words that are instantly recognizable, like “I sneezed the napkin off the table”? What exactly are all the words and what rules are involved?

Example after example shows us that language does not look very much like the traditional picture we have drawn of it. More and more linguists are looking at language with freshly open eyes but I worry that they may get off task when they’re asked to make a picture of what they see.

Where does the metaphor break

Ok, like all metaphors and analogies, even this one must come to an end. The power of a metaphor is not just finding where it fits but also pointing out its limits.

The obvious breaking point here is the level of complexity. Obviously, there’s only one very discretely delineated aspect of what the runners are doing that does not match what’s in the picture. The position of the arms. With language, we’re dealing with many subtle continua.

Also, the notion of the task is taken from a very specific branch of cognitive psychology and it may be inappropriate to extend it to areas where tasks take a long time, are collaborative and include a lot of deliberately chosen components as well as automaticity.

But I find it a very powerful metaphor nevertheless. It is not an easy one to explain because both fields are unfamiliar. But I think it’s worth taking the time with it if it opens the eyes of just one more person trying to make a picture of what language looks like.