I have only just discovered this book (academic PDF link and purchase link), and I have only just read this chapter, but I loved reading every page of it, and the author, Andy Clark, provides a really thorough account of predictive processing theory in the context of movement. The topic excites me because it raises really interesting questions about how the brain works as a whole, and Clark’s discussion is courageous in its consideration of potential pitfalls of predictive processing theory, which the framework appears to handle robustly.
I have been familiarising myself over the last few months with attempts to present a unified theory of brain function; here Clark names it predictive processing theory, but it has elsewhere been called predictive coding or active inference. At its heart is the idea that prior information stored in the brain determines and shapes our sensations in the present. Predictions of incoming sensory data arise from top-down connections, with unaccounted-for sensory information travelling up the hierarchy. In this chapter, the focus is on applying it to motor control in place of (perhaps) now more traditional computational models of motor control. As I’ll discuss in this chapter summary, there is a huge crossover between the models, but also a compelling elegance to how predictive processing accounts for some of the trickier challenges that face other computational models.
The chapter begins with the renowned ‘Why can’t you tickle yourself?’ work by Blakemore, Wolpert and Frith in 1998. They explained this strange phenomenon by positing a generative model, or a “forward model of the motor system”, which predicts the sensory consequences of self-generated movement. Because of this forward model, well-predicted sensory inputs are cancelled out or attenuated, meaning a self-tickle is not as ticklish as being tickled by someone else. The same concept underpins force escalation, as well as force overcompensation (in a self-generated condition but not an indirect condition) in a force-matching task (Shergill et al., 2003).
What advantages does a forward model offer? The use of prior information helps overcome the delays present throughout the sensorimotor system, such as lags in nerve conduction times for receiving afferent information, as well as in muscle responses. There is also a huge amount of potential incoming information about muscle length, limb position, and contact with objects – all of which change rapidly. Using a forward model to predict what these sensations should be, and attenuating anything expected, helps filter out the less useful information.
Clark then claims the forward model account does not sufficiently explain sensory attenuation (the dampening of self-generated sensory consequences): if predicted information were simply cancelled out, there should be a complete absence of sensation. Instead, it is dampened or attenuated, which could be explained either by predictions of not-quite-complete accuracy, or by variable precision weighting, which is discussed in the next chapter. Precision weighting actually appears in earlier streams of the computational, forward-model-based theories, such as Bayesian integration models where prior and sensory signals are combined and weighted based on their uncertainty (e.g. Körding and Wolpert, 2004).
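To make that precision-weighting idea concrete, here is a minimal sketch (the numbers are my own toy values, not the chapter’s) of how a Gaussian prior and a Gaussian sensory signal can be combined, each weighted by its precision (inverse variance), in the style of the Bayesian integration models mentioned above:

```python
def integrate(prior_mean, prior_var, sense_mean, sense_var):
    """Combine prior and sensory estimates, weighting each by its
    precision (the inverse of its variance)."""
    prior_precision = 1.0 / prior_var
    sense_precision = 1.0 / sense_var
    posterior_var = 1.0 / (prior_precision + sense_precision)
    posterior_mean = posterior_var * (prior_precision * prior_mean
                                      + sense_precision * sense_mean)
    return posterior_mean, posterior_var

# A precise sensory signal (low variance) dominates a vague prior:
mean, var = integrate(prior_mean=0.0, prior_var=4.0,
                      sense_mean=2.0, sense_var=0.5)
# mean is pulled strongly towards the sensation (~1.78), and the
# combined estimate is more certain than either source alone
```

The attraction of this formulation is that “trusting the prediction” versus “trusting the senses” is not a binary switch but a continuous weighting – which is one way a well-predicted self-tickle could come out attenuated rather than entirely absent.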
A more interesting problem, though, is the inverse model. Whereas the forward model predicts the sensory consequences of an action, the inverse model computes which movements will perform that action and generate the predicted sensory consequences. Together, the two form the ‘internal model’. Having two distinct parts of the generative model is characteristic of the internal-model-based framework (though earlier iterations had even more elements to account for the functions of the generative model – see Miall & Wolpert). Whilst effective ways have been found to demonstrate how the inverse model might perform its calculation, such as mixed cost functions and the minimum-jerk model, the predictive processing framework dispenses with the inverse model.
The predictive processing framework also dispenses with the efference copy of the motor command, which other theories require in order to provide the forward model with enough information to anticipate sensory consequences. Predictive processing, rather radically but also more subtly than first appears, argues that the prediction itself acts as a motor command. Anticipating the sensory consequences drives the body to move until prediction error is reduced. This also means cost functions are not needed in their original form, because you do not need to know the cost or value of a movement if you can anticipate a costly or value-related sensation. Instead, cost or value functions, Clark describes, are folded into the “context-sensitive generative models that simultaneously prescribe recognition and action”.
So motor command efference copies are not needed, as the forward model plays a bigger role in directly causing movement rather than just guiding it. Inverse models are also not needed: Clark cites roboticists arguing that cost-function-based solutions are “inflexible and biologically unrealistic”, and advocates of inverse-model-based frameworks also recognise the challenges of the ‘paired forward-inverse model’ architectures used to combine these various elements (Franklin & Wolpert, 2011). More simply, corollary discharges descend and cascade through the hierarchy of the nervous system, encoding predictive information for action. The framework recognises the complexity required of such a system to deal with complex environments in this way, whilst simplifying the model that describes how it does so. This reminds me of my recent reading into the cybernetic approach to studying the brain, which acknowledges complex systems and does not always seek to understand what is inside the “black box” whilst still attempting to understand how it works. See Adams, Shipp & Friston (2013) for more on how perceptual and motor systems should be considered a single “active inference machine”, and how forward models as corollary discharge can do more than just provide other elements with sensory predictions.
In summary, neural circuitry predicts the sensory consequences of an action. Prediction error then ensues, but is reduced by moving the body until the predicted sensations are realised. Clark does also consider the biology of the brain that would make all of this plausible. Referenced are:
- similarity of motor system and visual system in their downwards connections (Adams et al., 2012)
- a well-supported role of the cerebellum (for example, see Herzfeld & Shadmehr, 2014, and more recently, Kilteni et al., 2020)
- a simple motor command can unfold into a complex set of proprioceptive predictions as it cascades down the hierarchy, with motor coordination achieved by low-level reflex arcs quashing prediction error (Friston, 2011)
- Adams, Shipp & Friston (2013): the inverse model is relegated to the spinal level, since prior beliefs already encode proprioceptive predictions at higher levels
> Motor commands are thus replaced by descending proprioceptive predictions, whose origins may lie at the highest (multimodal or meta-modal) levels but whose progressive (context-sensitive) unpacking proceeds all the way to the spinal cord, where it is finally cashed out via classical reflex arcs.
>
> – Andy Clark, 2016
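As a toy illustration of that idea – entirely my own sketch, with an invented one-joint ‘plant’ and gain, not anything from the chapter – a descending proprioceptive prediction can act as a motor command simply by having a reflex-arc-like loop move the joint until the prediction error is quashed:

```python
def act_to_fulfil_prediction(predicted_angle, actual_angle,
                             gain=0.3, steps=50):
    """Move a single joint until the sensed angle matches the
    descending proprioceptive prediction (a crude reflex arc)."""
    trajectory = [actual_angle]
    for _ in range(steps):
        prediction_error = predicted_angle - actual_angle
        actual_angle += gain * prediction_error  # moving reduces error
        trajectory.append(actual_angle)
    return trajectory

# The 'command' is just the prediction of feeling the arm at 45 degrees:
path = act_to_fulfil_prediction(predicted_angle=45.0, actual_angle=10.0)
# the joint converges on the predicted angle; error is quashed
```

Nothing in this loop ever computes an inverse model or an efference copy: the prediction, plus a mechanism that reduces prediction error, is enough to cause the movement.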
Overall, I am compelled by the simplicity of a model which retains the theoretical power of the complex system that is the brain. I am keen to test the power of the framework in accounting for various phenomena such as sensory attenuation (which needs fleshing out) and even tremor. The framework appears anatomically plausible, and as it is not restricted to individual pathways, it might indeed be useful in accounting for elusive phenomena like physiological and pathological tremor, which are hard to pin down to a particular neural structure. Do see Friston’s (2018) comment on the power of predictive processing theory to pre-empt experimental data.
As Clark writes, if the generative model in predictive processing achieves the vast functionality described, then the burden shifts onto how predictions are acquired. There is reference to its success in robotics (Park et al., 2012), which takes inspiration from infantile motor babbling – random-ish movements that generate prediction error in the absence of an accurate prior, and in doing so form predictions through learning. Experience forms predictions; then predictions guide action and perception.
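A hypothetical sketch of that learning process (the linear ‘plant’ and all the numbers are my invention, not Park et al.’s model): issue random motor commands, observe what they feel like, and fit a forward model from the resulting command–sensation pairs:

```python
import random

def plant(command):
    """The body/environment the learner has no model of yet."""
    return 2.0 * command + 1.0

def babble_and_fit(n=200, seed=0):
    """Motor babbling: random commands, observed sensations, then an
    ordinary-least-squares fit of sensation ~ slope*command + intercept."""
    rng = random.Random(seed)
    commands = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    sensations = [plant(c) for c in commands]
    mc = sum(commands) / n
    ms = sum(sensations) / n
    slope = (sum((c - mc) * (s - ms) for c, s in zip(commands, sensations))
             / sum((c - mc) ** 2 for c in commands))
    intercept = ms - slope * mc
    return slope, intercept

slope, intercept = babble_and_fit()
# the fitted (slope, intercept) now predict the sensory consequence
# of any command, without ever having been told how the plant works
```

The point of the sketch is the ordering Clark describes: prediction error comes first (the random movements feel like nothing the system expected), and the predictions are a product of that error-driven experience.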
There are questions that remain (at least from this chapter). It is not entirely clear how actions are selected – might it be that the movement that best generates the desired sensation is selected? Would this require imagery? Or is the action selected based on the activated predicted consequence, which might be influenced multi-modally or at different hierarchical levels? Reward is considered to fold into implicit sensory expectations, as rewards form consequences of behaviours rather than causes. Though of course they then cycle round to influence behaviour when our beliefs interact with our environment the next time round. This part is not quite as neat as the rest, and perhaps needs further working through. I am also interested to understand exactly how representations of predictions change over time, in terms of actions being grouped as sequences, and the direction (up or down the hierarchy) these sequences move with learning and expertise – but I think this is enough writing for now!