Review
Neurosci Bull. 2021 Jun;37(6):831-852.
doi: 10.1007/s12264-021-00665-0. Epub 2021 Mar 29.

The Role of Dopamine in Associative Learning in Drosophila: An Updated Unified Model

Mohamed Adel et al. Neurosci Bull. 2021 Jun.

Abstract

Learning to associate a positive or negative experience with an unrelated cue after the presentation of a reward or a punishment defines associative learning. The ability to form associative memories has been reported in animal species as complex as humans and as simple as insects and sea slugs. Associative memory has even been reported in tardigrades [1], species that diverged from other animal phyla 500 million years ago. Understanding the mechanisms of memory formation is a fundamental goal of neuroscience research. In this article, we work on resolving the current contradictions between different Drosophila associative memory circuit models and propose an updated version of the circuit model that predicts known memory behaviors that current models do not. Finally, we propose a model for how dopamine may function as a reward prediction error signal in Drosophila, a dopamine function that is well-established in mammals but not in insects [2, 3].

Figures

Fig. 1
The classical model of associative learning in Drosophila. A In a naïve fly, the response to an odor is neutral since odor responses in MBONs that drive approach (green) and MBONs that drive avoidance (red) are unaltered. B In aversive training, a punishment such as electrical shock (ES) activates PPL1s. The coincidence between the dopaminergic input to KCs and the activation of the same KCs by odor results in long-term depression (LTD) of the KC → approach.MBON synapse. In future encounters with the odor, the circuit is biased to the output of avoidance MBONs because approach MBON responses are suppressed. This results in aversive memory behavior. C In appetitive training, a reward such as sugar activates PAMs. Coincidence between the dopaminergic input and odor responses in KCs depresses KC → avoidance.MBON synapses resulting in a positively-biased response to the odor and appetitive memory. Green, approach-related synapse; red, avoidance-related synapse; dotted line, multisynaptic odor pathway from antennae to KC projections in the Mushroom Body (MB). MB is shown with the 3 different types of KCs: α′β′ (dark grey), αβ (light grey), and γ (black).
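The depression rule described in this caption can be sketched in a few lines of Python. This is a toy illustration rather than the authors' model: the unit baseline weights, the 50% depression factor, the KC labels, and the helper names `naive_weights`, `train`, and `odor_bias` are all assumptions introduced here.

```python
# Toy sketch of the classical model; every number and name is an illustrative assumption.

# DAN type -> MBON class whose KC input is depressed by coincident dopamine.
DAN_TARGET = {"PPL1": "approach", "PAM": "avoidance"}

def naive_weights(kcs):
    """Equal KC -> MBON weights for both MBON classes (balanced naive fly)."""
    return {mbon: {kc: 1.0 for kc in kcs} for mbon in ("approach", "avoidance")}

def train(weights, odor_kcs, dan, ltd=0.5):
    """Return new weights after pairing an odor (its active KCs) with one DAN type."""
    new = {mbon: dict(w) for mbon, w in weights.items()}
    target = DAN_TARGET[dan]
    for kc in odor_kcs:
        # Only synapses from KCs active during dopamine release are depressed.
        new[target][kc] *= (1.0 - ltd)
    return new

def odor_bias(weights, odor_kcs):
    """Approach drive minus avoidance drive: > 0 approach, < 0 avoidance."""
    approach = sum(weights["approach"][kc] for kc in odor_kcs)
    avoidance = sum(weights["avoidance"][kc] for kc in odor_kcs)
    return approach - avoidance

kcs = ["kc1", "kc2", "kc3", "kc4"]
odor_a, odor_b = ["kc1", "kc2"], ["kc3", "kc4"]

naive = naive_weights(kcs)
aversive = train(naive, odor_a, "PPL1")   # shock paired with odor A
appetitive = train(naive, odor_a, "PAM")  # sugar paired with odor A

print(odor_bias(naive, odor_a), odor_bias(aversive, odor_a), odor_bias(appetitive, odor_a))
```

Because only the KCs active during training are affected, the bias for the untrained odor B stays at zero in this sketch, which mirrors the odor specificity the caption describes.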
Fig. 2
Rules of dopamine release in the updated model. A In the classical model, the CS activates a sparse subset of KCs, and the US causes dopamine release from dopaminergic neurons (DANs). Coincidence between KC activation and dopamine produces learning in downstream neurons. B In the ex vivo model, the US does not activate DANs but rather activates KCs through NMDA receptors. Coincidence between the KC activation and the NMDA receptor activation results in carbon monoxide release, activating DANs. The resultant dopamine release is sufficient to cause learning. C In the proposed updated model, the US activates both KCs and DANs. Activation of DANs is not sufficient to cause strong dopamine release. The simultaneous activation of KCs by both the CS and the US gates a positive feedback loop between KCs and DANs, possibly via the carbon monoxide release from KCs. This positive feedback loop increases the dopamine release locally onto KCs. Coincidence between the amplified dopamine signal onto KCs and their activation by the CS distinguishes the KCs responding to the CS, and results in specific learning.
Fig. 3
Updated model of the Drosophila associative memory circuit. Odors are sparsely encoded in the KCs of the MB. Reward activates PAMs and inhibits PPL1s (which rebound on release, not shown). Similarly, punishment activates PPL1s and inhibits PAMs (followed by a rebound, not shown). Coincidence between the KC activation and dopamine in the MB initiates a CO-dependent positive feedback loop between KCs and DANs, increasing the dopamine release. This amplified dopamine signal, paired with the KC activation, depresses KC → MBON synapses. Two groups of MBONs are shown for both approach and avoidance MBONs: excitatory and inhibitory. (*Excitatory and inhibitory MBON labels represent the MBON effect on DANs rather than any inherent property of the MBON). Recurrent feedback loops maintain the circuit balance between approach and avoidance MBONs. Importantly, the inhibition of DANs in this model is followed by a post-inhibitory rebound. This model relies on a recurrent-loop architecture between KC ⇌ DAN and approach.MBONs ⇌ avoidance.MBONs. Green and red colors indicate a positive (approach) or negative (avoidance) valence, respectively. Blue indicates neural events downstream of CS + US coincidence in the MB.
Fig. 4
Forward Learning in the updated model. A Odor responses are initially balanced in the naïve fly. B After aversive training, CS + US coincidence increases the gain of PPL1 dopamine release via a CO-dependent positive feedback loop. Coincidence between the amplified dopamine and the KC activation causes LTD of odor-specific KC → approach.MBON synapses. The LTD in approach MBONs releases avoidance MBONs from the inhibition by approach MBONs, resulting in potentiated odor responses in avoidance MBONs and aversive memory formation. C After appetitive training, CS + US coincidence increases the gain of PAM dopamine release via a CO-dependent positive feedback loop. The coincidence between the high dopamine and the KC activation causes LTD of odor-specific KC → avoidance.MBON synapses. The LTD in avoidance MBONs releases approach MBONs from the inhibition by avoidance MBONs, resulting in potentiated odor responses in approach MBONs, and appetitive memory formation.
Fig. 5
Backward Learning. Upper panels: the learning circuit in different scenarios. Lower panel: theoretical prediction of PAM neuron activity before and during US presentation, and after US termination. A Before training, the associative circuit is balanced between approach and avoidance MBONs. The PAM activity is at baseline. B US presentation without the CS activates PPL1s but is insufficient to initiate the KC ⇌ PPL1 positive feedback loop due to lack of coincidence with the CS. C After termination of the US, a subset of PAMs are released from inhibition and have a post-inhibitory rebound. Presentation of the CS during this time window (grey shaded area) achieves the coincidence between CS + PAM, driving appetitive learning as in Fig. 4C.
Fig. 6
US re-exposure-induced forgetting. A After aversive learning, the circuit is biased towards avoidance of the learned odor due to LTD of the KC → approach.MBON synapses. Presentation of the US after training activates PPL1s. The unpaired dopamine is sufficient to abolish the LTD of KC → approach.MBONs (dashed black excitatory line). B Approach MBON odor responses return to normal levels after the unpaired dopamine potentiates the depressed synapses, occluding the old memory.
Fig. 7
Retrieval-induced forgetting. A In a circuit biased towards odor aversion due to LTD of KC → approach.MBON synapses, odor responses in avoidance MBONs are released from their inhibition by approach MBONs. This potentiation in MBON responses drives the activation of PAMs. This allows for a coincidence between the odor presentation and the PAM activation, causing LTD of the KC → avoidance.MBON synapses. C The effects of this LTD of KC → avoidance.MBONs balances out the LTD of KC → approach.MBONs and occludes the aversive memory behavior. B Opposite to (A), appetitive training causes LTD of KC → avoidance.MBON synapses which releases approach MBONs from their inhibition by avoidance MBONs. This in turn drives the activation of PPL1s. Therefore, the unpaired presentation of the CS after appetitive training allows the coincidence between an odor and PPL1 activation and results in depressing KC → approach.MBON synapses. C This newly-formed LTD of KC → approach.MBONs balances out the LTD of KC → avoidance.MBONs and occludes the memory behavior. The dashed blue line represents the new LTD that forms after unpaired CS presentation.
Fig. 8
Illustration of second-order conditioning with a numerical example. In this example, we arbitrarily assume a 50% release of available PPL1 and PAM weight due to the suppression of MBON11 after each step. A In a natural situation, both PAM and PPL1 are completely suspended by the MBON11 springs. B Cutting all the springs (similar to silencing of approach MBONs) results in the release of both balls. PPL1 has a more significant impact in this case because it is heavier. C Aversive training results in a biased system where PPL1 is partially released from inhibition (50% of the ball weight is applied to the seesaw). D Re-exposure to the unpaired CS when MBON11 is suppressed releases an extra 50% of PPL1’s available weight, and 50% of PAM’s full weight. Note that the difference between applied weights (Δ) is less after the unpaired CS exposure than after the initial training. E Repeating unpaired exposure to the CS applies an extra 50% of the available weight of each ball; the effect of the extra release of PAMs is more significant than that of the extra release of PPL1s because of the greater available weight of the PAM ball, hence the PPL1-induced aversive memory is gradually abolished. Note that with every repetition of the unpaired CS exposure, the difference in applied weights between PPL1 and PAM gets smaller (Δ). Key: springs represent the direct inhibition of PPL1s by approach MBONs and the indirect inhibition of PAMs by the same approach MBONs through their inhibition of avoidance MBONs. Grey bars represent the system state in the previous panel. Light-grey shapes represent the available weights. Colored sections represent the applied weights.
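The arithmetic of the seesaw example can be traced with a short Python sketch. The caption gives only the 50% release rule and states that the PPL1 ball is heavier; the full weights used below (10 for PPL1, 8 for PAM) and the function names `release` and `seesaw_deltas` are hypothetical values chosen here for illustration.

```python
# Sketch of the seesaw arithmetic; ball weights are assumed, not taken from the paper.

def release(applied, full, fraction=0.5):
    """Apply `fraction` of the weight that is still held back by the springs."""
    return applied + fraction * (full - applied)

def seesaw_deltas(ppl1_full=10.0, pam_full=8.0, n_unpaired=3, fraction=0.5):
    """Return the PPL1 - PAM applied-weight difference after training and each unpaired CS."""
    # Aversive training: only PPL1 is partially released (panel C).
    ppl1 = release(0.0, ppl1_full, fraction)
    pam = 0.0
    deltas = [ppl1 - pam]
    # Each unpaired CS exposure releases a further share of both balls (panels D and E).
    for _ in range(n_unpaired):
        ppl1 = release(ppl1, ppl1_full, fraction)
        pam = release(pam, pam_full, fraction)
        deltas.append(ppl1 - pam)
    return deltas

print(seesaw_deltas())  # [5.0, 3.5, 2.75, 2.375]
```

With these assumed weights the difference Δ shrinks on every unpaired exposure, as the caption notes. In this toy version Δ approaches the difference between the two full weights rather than zero, so how completely the memory is abolished depends on the weights one assumes.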
Fig. 9
Trace Memory Formation. A CS information is encoded by activity in KCs and is prolonged due to a recurrent loop between KCs and DANs. This imprint degrades gradually over time. B Delivering an electric shock (US) after CS termination but before degradation of the prolonged activity of KCs allows for a coincidence between the CS and US signal encoded by PPL1s, which causes LTD of KC → approach.MBONs, thus forming an aversive memory. Similarly, the CS imprint in the KCs is prolonged by a reciprocal synapse with PAMs (not shown). C Hypothetical CS-induced activity of KCs before and after CS presentation showing the time window in which US presentation achieves learning either without adding the KC ⇌ DAN recurrent loop (light grey) or after adding this loop (dark grey).
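The effect of the recurrent loop on the learning window in panel C can be illustrated with a minimal Python sketch. The caption says only that the CS imprint degrades gradually; the exponential decay, the activity threshold for coincidence, the time constants, and the function name `coincidence_window` are all assumptions made here.

```python
import math

def coincidence_window(initial_activity, threshold, tau):
    """Time after CS offset during which KC activity stays above `threshold`.

    Assumes activity decays as initial_activity * exp(-t / tau); this decay law
    is an illustrative assumption, not something stated in the caption.
    """
    if initial_activity <= threshold:
        return 0.0
    return tau * math.log(initial_activity / threshold)

# Hypothetical time constants: the KC-DAN recurrent loop is modeled as slower decay.
without_loop = coincidence_window(1.0, 0.2, tau=5.0)
with_loop = coincidence_window(1.0, 0.2, tau=15.0)
print(without_loop, with_loop)
```

Under this assumption the window scales linearly with the decay time constant, so any mechanism that prolongs KC activity widens the interval in which a delayed US can still coincide with the CS trace.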
Fig. 10
A mechanism for memory transfer. Similar to the seesaw model shown in Fig. 8, the system is balanced before training (A). Aversive training with odor A applies 50% of PPL1’s weight on the circuit. This is represented by a partial release of the PPL1 ball (B). However, odor B responses are still balanced because the synapses between the KCs responding to odor B and approach MBONs are not depressed (C). For odor A, as described in Fig. 8, 50% of PPL1 weight is already applied to the seesaw. Therefore, presenting odor A with odor B releases 50% of the full PAM weight and an extra 50% of PPL1 available weight. This results in an added stronger impact of PAMs on the circuit and thus a weakening of odor A’s aversive memory; this is signified by the smaller (Δ) in (D). Presenting odor A with odor B allows odor B to utilize the suppression of approach MBONs due to aversive learning to odor A. However, for the novel odor B, the system is still balanced. Therefore, 50% of the full weight of both PPL1 and PAM is released. This results in a stronger impact of PPL1 than of PAM, hence the formation of a weak aversive memory for odor B (E). Key: springs represent the direct inhibition of PPL1s by approach MBONs and the indirect inhibition of PAMs by the same approach MBONs through their inhibition of avoidance MBONs. Light-grey shapes represent the available weights. Colored sections represent the applied weights.
Fig. 11
Reward prediction error with two DAN components. A In a naïve fly, evoked responses in approach and avoidance MBONs by KC activation are balanced. Reciprocal inhibition between approach and avoidance MBONs maintains this balance. B Activation of optimistic neurons (PPL1; not shown) encodes a negative reward prediction error component (–d) that depresses the KC → approach.MBON synapse. This results in weaker responses in approach MBONs, hence weaker inhibition of avoidance MBONs. In turn, responses in avoidance MBONs, and inhibition of approach MBONs by avoidance MBONs, are potentiated. C Activation of pessimistic neurons (PAM; not shown) encodes a positive reward prediction error component (+d) that depresses the KC → avoidance.MBON synapse. This results in the weakening of avoidance MBON responses, and, consequently, weakening of the avoidance.MBON → approach.MBON inhibitory synapse. In turn, approach MBON responses are potentiated; thus, the depressed responses in approach MBONs that were encoded by the (–d) component are abolished by the (+d) component and return to baseline levels. No learning occurs when both components are equally activated.
Fig. 12
Simplified update of the associative memory circuit model showing a mechanism for reward prediction error. This figure builds on the circuit mechanism described in Fig. 3. Here, we propose that PPL1 and PAM dopaminergic release encodes negative (–d) and positive (+d) components of the prediction error, and that the mean of the two components represents the deterministic prediction error for CS choice. Further, the model illustrates a circuit mechanism for how the two prediction error components interact to produce the correct behavior. The (–d) component depresses KC → approach.MBON synapses (solid blue line) and indirectly potentiates KC → avoidance.MBON synapses (dashed blue line), and vice versa in the case of the (+d) prediction error component. This allows the downstream synaptic changes to occur only when there is a negative or positive bias between the two prediction error components. This means that learning takes place only when the mean prediction error ≠ 0.
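The two-component rule in this caption can be written out as a small Python sketch. It assumes that each component is summarized by a single non-negative release magnitude and adds a numerical tolerance for the zero test; the function names, the tolerance, and the example magnitudes are hypothetical.

```python
def mean_prediction_error(ppl1_release, pam_release):
    """Mean of the negative (PPL1, -d) and positive (PAM, +d) components.

    Both arguments are assumed to be non-negative release magnitudes.
    """
    return ((-ppl1_release) + pam_release) / 2.0

def learning_outcome(ppl1_release, pam_release, tol=1e-9):
    """Classify the net effect: learning occurs only when the mean error is nonzero."""
    delta = mean_prediction_error(ppl1_release, pam_release)
    if abs(delta) <= tol:
        return "no learning"
    # A negative mean depresses KC -> approach.MBON synapses (aversive memory);
    # a positive mean depresses KC -> avoidance.MBON synapses (appetitive memory).
    return "aversive" if delta < 0 else "appetitive"

print(learning_outcome(1.0, 1.0))  # no learning
print(learning_outcome(1.0, 0.0))  # aversive
print(learning_outcome(0.0, 1.0))  # appetitive
```

Equal activation of both components cancels out in the mean, which is the case the model treats as producing no net synaptic change.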

References

    1. Zhou S, DeFranco JP, Blaha NT, Dwivedy P, Culver A, Nallamala H, et al. Aversive conditioning in the tardigrade, Dactylobiotus dispar. J Exp Psychol Anim Learn Cogn. 2019;45:405–412.
    2. Dylla KV, Raiser G, Galizia CG, Szyszka P. Trace conditioning in Drosophila induces associative plasticity in mushroom body Kenyon cells and dopaminergic neurons. Front Neural Circuits. 2017;11:1–14.
    3. Riemensperger T, Völler T, Stock P, Buchner E, Fiala A. Punishment prediction by dopaminergic neurons in Drosophila. Curr Biol. 2005;15:1953–1960.
    4. Heisenberg M. Mushroom body memoir: from maps to models. Nat Rev Neurosci. 2003;4:266–275.
    5. Keene AC, Waddell S. Drosophila olfactory memory: single genes to complex neural circuits. Nat Rev Neurosci. 2007;8:341–354.
