Goal-Directed and Habitual: Some Evidence

According to the dual-process theory, instrumental actions can be a consequence of both goal-directed processes and habitual processes. So far we have mainly relied on testimony for this key premise. It’s now time to consider evidence for it.

This recording is also available on stream (no ads; search enabled). Or you can view just the slides (no audio or video). You should not watch the recording this year; it’s all happening live (advice).

Notes

This is an optional section. We did not cover it in lectures. If you completed the reading for Seminar 1, you have already encountered the evidence here (although perhaps not all of the details).

Until The Minor Puzzle about Habitual Processes we had not encountered any evidence at all for the dual-process theory of instrumental action. What evidence supports this theory?

The section introduces three sources of evidence:

cognitive load (via stress) - Schwabe & Wolf (2010)^[1]

representation of contingency - Klossek, Yu, & Dickinson (2011)

neurophysiology - Dickinson (2016)

If you have difficulty with this (perhaps you are new to psychology, or perhaps you just struggle to follow the lecturer), please consider just the first of these.

It would be much better to have a firm understanding of Schwabe & Wolf (2010) than to have a sense of what each of the three sources of evidence involves.

Speed vs flexibility

In the lecture I justify some of the predictions tested with the consideration that any broadly cognitive process must make a trade-off between speed and flexibility. This idea is further developed by Daw, Niv, & Dayan (2005, p. 1705), who contrast the use of cached values (which is fast but insensitive to rapid changes in the environment) with values computed on the fly (which may demand time and effort but allows more flexibility).

In essence, the idea is that the goal-directed process involves searching through potential actions, predicting their likely consequences and anticipating how valuable (or not) those consequences would be. This ‘poses severe demands on computation and memory and rapidly becomes intractable with growing complexity’ (Wunderlich, Dayan, & Dolan, 2012, p. 786). By contrast, the habitual process is much less demanding, as it does not even require memory of the consequences of actions. But there is a trade-off: in return for being less demanding, the habitual process is unreliable in a rapidly changing environment or where there has been insufficient learning.
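The contrast between cached values and values computed on the fly can be made concrete with a toy sketch. This is only an illustration under simplifying assumptions, not the model of Daw, Niv, & Dayan (2005): the two actions, the two outcomes, the numerical values and all function names are hypothetical. The habitual process looks up values stored during training, whereas the goal-directed process predicts each action’s outcome and evaluates that outcome at the time of choice.

```python
# Hypothetical toy world: two actions, each producing one outcome.
ACTION_OUTCOME = {"press_left": "chocolate", "press_right": "orange"}


def train_cached_values(outcome_value, trials=50, learning_rate=0.2):
    """Habitual process: build cached action values from rewarded experience."""
    cached = {action: 0.0 for action in ACTION_OUTCOME}
    for _ in range(trials):
        for action, outcome in ACTION_OUTCOME.items():
            reward = outcome_value[outcome]
            # Nudge the cached value towards the reward just received.
            cached[action] += learning_rate * (reward - cached[action])
    return cached


def habitual_choice(cached):
    """Fast: look up cached values; outcomes are never consulted."""
    return max(cached, key=cached.get)


def goal_directed_choice(outcome_value):
    """Flexible: predict each action's outcome and evaluate it now."""
    return max(ACTION_OUTCOME, key=lambda a: outcome_value[ACTION_OUTCOME[a]])


# Training phase: chocolate is valued more than orange (assumed values).
outcome_value = {"chocolate": 1.0, "orange": 0.6}
cached = train_cached_values(outcome_value)

# Devaluation: the agent is sated on chocolate. No further training follows,
# so the cached values cannot be updated.
outcome_value["chocolate"] = 0.0

print(habitual_choice(cached))              # press_left
print(goal_directed_choice(outcome_value))  # press_right
```

After devaluation the cached values still favour the action that used to produce the more valuable outcome, so the habitual choice is unchanged, while the goal-directed choice switches at once. The sketch leaves out the cost side of the trade-off: with only two actions the search is trivial, whereas in realistic environments it is the search that becomes demanding.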

Glossary

devaluation : To devalue some food (or video clip, or any other thing) is to reduce its value, for example by allowing the agent to satiate themselves on it or by causing them to associate it with an uncomfortable event such as an electric shock or mild illness.

dual-process theory of instrumental action : Instrumental action ‘is controlled by two dissociable processes: a goal-directed and an habitual process’ (Dickinson, 2016, p. 177). (See instrumental action.)

extinction : In some experiments, there is a phase (usually following instrumental training and an intervention such as devaluation) during which the subject encounters the training scenario exactly as it was (same stimuli, same action possibilities) but the actions produce no relevant outcomes. In this extinction phase, there is no reward (nor punishment). (It is called ‘extinction’ because in many cases not rewarding (or punishing) the actions will eventually extinguish the stimulus–action links.)

goal-directed process : A process which involves ‘a representation of the causal relationship between the action and outcome and a representation of the current incentive value, or utility, of the outcome’ and which influences an action ‘in a way that rationalizes the action as instrumental for attaining the goal’ (Dickinson, 2016, p. 177).

habitual process : A process underpinning some instrumental actions which obeys Thorndike’s Law of Effect: ‘The presentation of an effective [=rewarding] outcome following an action [...] reinforces a connection between the stimuli present when the action is performed and the action itself so that subsequent presentations of these stimuli elicit the [...] action as a response’ (Dickinson, 1994, p. 48). (Interesting complication which you can safely ignore: there is probably much more to say about under what conditions the stimulus–action connection is strengthened; e.g. Thrailkill, Trask, Vidal, Alcalá, & Bouton, 2018.)

instrumental action : An action is instrumental if it happens in order to bring about an outcome, as when you press a lever in order to obtain food. (In this case, obtaining food is the outcome, lever pressing is the action, and the action is instrumental because it occurs in order to bring it about that you obtain food.)
You may encounter variations on this definition of instrumental in the literature. For instance, Dickinson (2016, p. 177) characterises instrumental actions differently: in place of the teleological ‘in order to bring about an outcome’, he stipulates that an instrumental action is one that is ‘controlled by the contingency between’ the action and an outcome. And de Wit & Dickinson (2009, p. 464) stipulate that ‘instrumental actions are learned’.

References

Buabang, E. K., Boddez, Y., Wolf, O. T., & Moors, A. (2023). The role of goal-directed and habitual processes in food consumption under stress after outcome devaluation with taste aversion. Behavioral Neuroscience, 137(1), 1–14. https://doi.org/10.1037/bne0000439

Daw, N. D., Niv, Y., & Dayan, P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience, 8(12), 1704–1711. https://doi.org/10.1038/nn1560

de Wit, S., & Dickinson, A. (2009). Associative theories of goal-directed behaviour: A case for animal–human translational models. Psychological Research PRPF, 73(4), 463–476. https://doi.org/10.1007/s00426-009-0230-6

Dickinson, A. (1994). Instrumental conditioning. In N. Mackintosh (Ed.), Animal learning and cognition. London: Academic Press.

Dickinson, A. (2016). Instrumental conditioning revisited: Updating dual-process theory. In J. B. Trobalon & V. D. Chamizo (Eds.), Associative learning and cognition (Vol. 51, pp. 177–195). Edicions Universitat Barcelona.

Dickinson, A., & Pérez, O. D. (2018). Actions and Habits: Psychological Issues in Dual-System Theory. In R. Morris, A. Bornstein, & A. Shenhav (Eds.), Goal-Directed Decision Making (pp. 1–25). Academic Press. https://doi.org/10.1016/B978-0-12-812098-9.00001-2

Klossek, U. M. H., Yu, S., & Dickinson, A. (2011). Choice and goal-directed behavior in preschool children. Learning & Behavior, 39(4), 350–357. https://doi.org/10.3758/s13420-011-0030-x

Schwabe, L., & Wolf, O. T. (2010). Socially evaluated cold pressor stress after instrumental learning favors habits over goal-directed action. Psychoneuroendocrinology, 35(7), 977–986. https://doi.org/10.1016/j.psyneuen.2009.12.010

Thrailkill, E. A., Trask, S., Vidal, P., Alcalá, J. A., & Bouton, M. E. (2018). Stimulus control of actions and habits: A role for reinforcer predictability and attention in the development of habitual behavior. Journal of Experimental Psychology: Animal Learning and Cognition, 44, 370–384. https://doi.org/10.1037/xan0000188

Wunderlich, K., Dayan, P., & Dolan, R. J. (2012). Mapping value based planning and extensively trained choice in the human brain. Nature Neuroscience, 15(5), 786–791. https://doi.org/10.1038/nn.3068

Endnotes

[1] Note that Buabang, Boddez, Wolf, & Moors (2023) report a failed replication of this finding. If you rely on Schwabe & Wolf (2010), it would be good to consider whether this failed replication should undermine confidence in the original result. My own view is that it should not. This is because whereas Schwabe & Wolf (2010) used satiety to devalue, Buabang et al. (2023) used ‘Tween 20 (Polysorbate 20), a colorless and odorless substance that creates a bad taste’. As the authors note, this creates an aversion to the food. But there is a distinction between a change in desire for a food and an aversion to it. We would expect habitual behaviours to be influenced by a change in aversion but not by a change in desire (see Preference vs Aversion: A Dissociation).