
Zhurnal Vysshei Nervnoi Deyatel'nosti, 2014, vol. 64, no. 3, pp. 251-254

REVIEWS, THEORETICAL ARTICLES

UDC 612.826

Specific Role of Dopamine in Striatum during Instrumental Learning

© 2014 N. Yu. Ivlieva, D. A. Ivliev

Institute of Higher Nervous Activity and Neurophysiology, Russian Academy of Sciences, Moscow,

e-mail: nivlieva@mail.ru

Received April 3, 2013. Accepted for publication December 16, 2013.

Research on dopamine's role in behavior is seemingly in a state of permanent controversy over all major topics. The notion of 'prediction error' is a central component of current reward learning models, but there are many caveats and contradictions in the supporting data. In this paper we propose that the same dopamine signal can both promote an action and reinforce it, and we outline a novel model of reward learning in which dopamine provides a kind of teaching signal, with DA release starting well before and persisting beyond the to-be-reinforced action. The post-response signal that provides the true excitatory drive for LTP comes from the intralaminar thalamus. The core of the hypothesized mechanism is the direct pathway of striatal projection neurons, and there are reasons to believe that the indirect pathway can substantially modulate the direct pathway, thus providing behavioral flexibility.

Keywords: dopamine, striatum, intralaminar thalamus, D1-, D2-receptors, instrumental learning, reward, reinforcement, model.

SPECIFIC ROLE OF DOPAMINE IN STRIATUM DURING INSTRUMENTAL LEARNING

N. Yu. Ivlieva, D. A. Ivliev

Institute of Higher Nervous Activity and Neurophysiology, Russian Academy of Sciences, Moscow,

e-mail: nivlieva@mail.ru

The role of dopamine in learning is a subject of ongoing debate. The dominant view is that phasic activation of midbrain dopaminergic neurons encodes a reward prediction error, but a large body of data contradicting this view has accumulated. We propose a model of instrumental learning which allows that one and the same dopamine signal in the striatum both provides nonspecific activation of the motor system for executing a movement and organizes reinforcement processes. The main link of the mechanism that makes this possible is the system of D1-expressing striatal projection neurons. Dopamine release in the striatum precedes and accompanies the instrumental movement, while the signal that a particular movement was successful is the activation of effective excitatory inputs from the intralaminar thalamic nuclei. The coincidence of these events is the condition for long-term potentiation of inputs to striatal direct pathway neurons from the corresponding motor cortical areas, and indirect pathway neurons are capable of substantially modulating direct pathway neurons.

Keywords: dopamine, striatum, intralaminar thalamic nuclei, instrumental learning, reinforcement, model, D1-, D2-receptors.

DOI: 10.7868/S004446771403006X

An enormous amount of detail about the functions of the dopaminergic system has been discovered, but those functions are still rather poorly understood. In this paper we propose that the same dopamine (DA) signal can both promote an action and reinforce it, and we outline a novel model of reward learning in which dopamine provides a kind of teaching signal, with DA release starting well before and persisting beyond the to-be-reinforced action.


Fig. Panels: Ineffective movement; Effective movement. Traces: D1-SPN ("up"-state, LTP), Intralaminar thalamus, Premotor/Motor cortex, Dopamine, Door (Shut, Opened), Movement (M).

A diagram illustrating the hypothesized learning mechanism, which incorporates cortical and intralaminar thalamic inputs to D1-expressing striatal projection neurons (D1-SPNs). Thorndike's cat, confined in a cage, makes various movements (m1 ... mn) until it performs the successful movement (M), which opens the door. The first and the second (or any repeated) performance of one of the ineffective movements ("mn") and of the effective movement ("M") are shown. All movements are performed while the DA level in the striatum is high. An ineffective movement is accompanied by activation of cortical inputs to D1-expressing SPNs, but this activation is not enough to drive the transition to the "up"-state (left). A successful movement is followed by additional activation of intralaminar thalamic inputs, which induces the "up"-state transition. This in turn leads to LTP induction at corticostriatal synapses and, as a result, to subsequent facilitation of the effective movement (right). The dashed line indicates the threshold level for the "up"-state transition. LTP: long-term potentiation.
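The mechanism in the diagram can be read as a threshold rule, and a toy simulation may make the logic easier to follow. The sketch below is only an illustration under stated assumptions, not the authors' model: the class `D1SPN`, the function `simulate`, and all numerical values (threshold, initial synaptic weight, thalamic drive, LTP step) are hypothetical and in arbitrary units. It assumes that cortical and thalamic drives add linearly, that the "up"-state is reached when their sum crosses a fixed threshold, and that the corticostriatal weight grows by a fixed step only when the "up"-state coincides with a high DA level.

```python
from dataclasses import dataclass

# All constants are hypothetical, in arbitrary units, chosen only for illustration.
UP_STATE_THRESHOLD = 1.0   # total drive needed for the "up"-state transition
LTP_STEP = 0.25            # weight increase per successful pairing


@dataclass
class D1SPN:
    """Toy D1-expressing striatal projection neuron tied to one movement."""
    cortical_weight: float = 0.5  # efficacy of the corticostriatal input

    def trial(self, thalamic_drive: float, dopamine_high: bool = True) -> bool:
        """Perform the movement once; return True if the "up"-state was reached."""
        total_drive = self.cortical_weight + thalamic_drive
        up_state = total_drive >= UP_STATE_THRESHOLD
        # Assumed rule: LTP requires the "up"-state together with high DA.
        if up_state and dopamine_high:
            self.cortical_weight += LTP_STEP
        return up_state


def simulate(thalamic_drive: float, n_trials: int, dopamine_high: bool = True):
    """Repeat one movement n_trials times; return (up_state, weight) after each trial."""
    neuron = D1SPN()
    history = []
    for _ in range(n_trials):
        up_state = neuron.trial(thalamic_drive, dopamine_high)
        history.append((up_state, neuron.cortical_weight))
    return history


if __name__ == "__main__":
    # Ineffective movement "mn": cortical input only, no thalamic signal.
    print("ineffective:", simulate(thalamic_drive=0.0, n_trials=2))
    # Effective movement "M": the door opens and thalamic input is added.
    print("effective:  ", simulate(thalamic_drive=0.75, n_trials=2))
```

With these arbitrary values, repeating an ineffective movement leaves the cortical weight unchanged, whereas each repetition of the effective movement strengthens it. The sketch deliberately omits the indirect pathway and the filtering effect of D1-receptor activation on "up"-state transitions.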


Research on DA's role in behavior is seemingly in a state of permanent controversy over all major topics. The different views of DA's precise contribution can be grouped into three standpoints: 1) DA itself induces pleasure [Wise, Bozart, 1985], 2) DA neurons report "prediction errors", thus providing a universal teaching signal that induces learning [Schultz, 2002], 3) DA mediates motivation [Berridge, 2007] and/or energizes behavior [Salamone et al., 2007].

Each standpoint is based on extensive experimental data, and it is believed that they are not mutually exclusive [Beeler, 2012], but the conceptual contrast between motivation and reinforcement is obvious. Moreover, it is paradoxical that activation of the same brain substrate should be both reinforcing and drive-inducing [Wise, 2012]. Nevertheless, we assume that DA may take part in both of two processes that do not overlap in time: motivation, which precedes the action, and reinforcement, which follows it. A key question, in our view, is which behavioral events correspond to activation of the dopaminergic system. The data on this are numerous and quite contradictory, often because of differences in behavioral tasks, training procedures, duration of learning, etc. Studies conducted in unrestrained animals reveal patterns that relate to naturalistic motivated behaviors, such as the animal's movement to obtain rewards or to avoid punishment. These studies demonstrate that dopaminergic system activation precedes the initiation of instrumental movement [Roitman et al., 2004; Ivlieva, 2010; Puryear et al., 2010; Cacciapaglia et al., 2011; Oleson et al., 2012; Wassum et al., 2012].

At the core of the proposed mechanism is the dynamics of DA concentration in the striatum. We assume that at the initial stage of learning a considerable increase in the striatal DA level is a necessary condition for the execution of goal-directed actions. The more intense the movements must be [Salamone et al., 2007] and the more uncertain the consequences of these actions [de Lafuente, Romo, 2011] (as is the case in the initial period of learning), the higher the activity of the dopaminergic system that accompanies these movements in the striatum.

The main part of the hypothesized mechanism may be the system of striatal projection neurons (SPNs). These neurons represent the vast majority of striatal neurons. They can be classified into striatonigral (direct pathway) and striatopallidal (indirect pathway) subtypes on the basis of their axonal projections to the substantia nigra pars reticulata (SNpr) or the globus pallidus. SPNs receive excitatory inputs from the cortex and the intralaminar thalamus [Kreitzer, 2009]. The critical features of SPNs in this context are:

1) almost complete segregation, in most parts of the striatum, of D1- and D2-receptors on neurons of the direct and indirect pathways, respectively [Gerfen, Surmeier, 2011];

2) a significant difference in the excitability of D1- and D2-expressing SPNs [Gerfen, Surmeier, 2011];

3) a considerable difference in the affinity of D1- and D2-receptors for DA [Neve, Neve, 1997; Richfield, 1989];

4) different, in many ways opposite, effects of DA on the excitability of D1- and D2-expressing projection neurons [Gerfen, Surmeier, 2011; Kreitzer, 2009];

5) opposite effects of high and low concentrations of DA on the sign of synaptic plasticity in the direct and indirect pathways [Shen et al., 2008].

The substantially lower excitability of striatal D1-expressing SPNs compared with D2-expressing neurons, together with the increase in their excitability in the "up"-state under the influence of DA, makes these neurons the best candidates to "capture" the motor patterns that precede reward acquisition.

Let us consider the hypothesized mechanism in the situation of Thorndike's cat. At the beginning of training the animal performs a sequence of different movements from its motor repertoire to gain freedom or food; when the cat succeeds in opening the box door by means of one of these movements, it gains freedom and access to food [Thorndike, 1911]. These movements appear to be performed while the concentration of DA in the striatum is high. Two possible behavioral situations are therefore considered.

1. While the cat performs any goal-directed movement ("mn" in fig.), cortical inputs innervating direct pathway neurons convey preferentially motor planning information and, to a lesser extent, the efference-copy signal to the striatum [Reiner et al., 2010]. When these inputs are activated, the probability of direct pathway neurons spiking is low, because they are inherently less excitable [Gerfen, Surmeier, 2011] and because their inputs are relatively ineffective in producing postsynaptic depolarization [Reiner et al., 2010]. In addition, DA impedes the transition of D1-expressing SPNs to the "up"-state, since D1-receptor activation acts as a filter that limits "up"-state transitions to periods of significant excitatory drive [Kreitzer, 2009]. So, during ineffective movements, activation of D1-expressing SPNs and subsequent induction of any changes in synaptic efficacy are unlikely (fig., left).

In contrast, neurons of the indirect pathway seem more likely to be activated during any current movement [Reiner et al., 2010]; however, in the "up"-state DA inhibits their activity [Gerfen, Surmeier, 2011; Kreitzer, 2009
