Classical and Operant

Let’s talk conditioning. What’s the difference between classical and operant conditioning? When do you use each in training, and why does it matter?

Classical conditioning is what most people are familiar with: the bell rings, the dog anticipates food. Some people know this as Pavlovian conditioning or respondent conditioning. It is a simple pairing of two stimuli, where X predicts Y. Respondent behaviors are behaviors that happen reflexively and involuntarily, such as blinking, an increase in heart rate or respiration, wincing, etc. Respondent behaviors happen in “response” to the environment. For example, food will trigger salivation. Respondent conditioning is when we add a predictor signal: the food causes salivation, and the bell predicts the food. Once the bell and food have been sufficiently paired, the bell alone will trigger the response, salivation in anticipation of food.

We can use this kind of learning through pairing in a wide variety of ways in our training. Most commonly we condition a sound (like a clicker or whistle) with food, so we can use it in clicker training. We can also condition aversive stimuli: when the presence of a whip is paired with the physical sensation of being hit, the whip becomes conditioned as a predictor of pain. This is what we are seeing when a horse picks up speed when the rider is just holding the whip, not using it, because the pairing is understood. We can take any stimulus (sound, sight, touch, etc.) and condition it to predict something else. Once paired consistently, the first stimulus takes on the meaning of the second stimulus.

This is important to remember in training. The first stimulus predicts the second one. If you feed and then click, the food is being conditioned to predict the sound, when what we really want is for the sound to predict the food. So we click and then present the food.

We can also take this respondent conditioning one step further. If the horse already has an association with a stimulus, we can change their response to it through pairing. A horse may dislike being sprayed with fly spray, but by pairing the fly spray with something they do like (food) we can change how they feel about the spray. By spraying (starting somewhere the horse is comfortable with the spray) and then adding food, the horse learns that the spray predicts food. Soon you have counterconditioned the spray. But keep the order in mind: if food predicts the spray, you may countercondition the food to be something they dislike. We see this often where horses who have been bribed with food learn “the food is a trap”. This is conditioning happening in an unintended direction. The key is the order in which the stimuli are presented.

Operant conditioning does not just give a stimulus a meaning; it adds a behavioral component. Rather than the environment triggering a reflexive response, this time it triggers a voluntary behavior, which will be influenced by the results. The environment triggers a behavior, which has a consequence. If the consequence is good, the learner will do that behavior more in that scenario in the future. If the consequence is bad, the learner will do that behavior less in that scenario in the future.

This is where our learning quadrants come into play. These are the 4 possible consequences of a behavior.

Reinforcement is when the consequence causes the behavior to happen more in the scenario.

This can be done by adding (positive) something the learner likes.

Or by removing (negative) something the learner dislikes.

Punishment is when the consequence causes the behavior to happen less in the scenario.

This can be done by adding (positive) something the learner dislikes.

Or by removing (negative) something the learner likes.
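
For readers who find it easier to see the four quadrants as a small lookup, here is a minimal sketch in Python. The function name `quadrant` and its two arguments are made up for illustration; it only encodes the rule described above (added or removed, liked or disliked) and is not a model of real learning.

```python
def quadrant(action: str, learner_likes: bool) -> str:
    """Name the operant quadrant for a consequence.

    action: "add" or "remove" (hypothetical labels for what happens
            to the stimulus right after the behavior).
    learner_likes: True if the learner likes the stimulus, False if not.
    """
    if action not in ("add", "remove"):
        raise ValueError("action must be 'add' or 'remove'")

    # Adding is "positive", removing is "negative".
    sign = "positive" if action == "add" else "negative"

    # Adding something liked, or removing something disliked, makes the
    # behavior more likely (reinforcement). The other two combinations
    # make it less likely (punishment).
    good_outcome = (action == "add") == learner_likes
    effect = "reinforcement" if good_outcome else "punishment"

    return f"{sign} {effect}"


# Print all four combinations.
for action in ("add", "remove"):
    for likes in (True, False):
        print(action, "liked" if likes else "disliked", "->", quadrant(action, likes))
```

Note that “positive” and “negative” only describe whether something was added or removed, not whether the outcome is pleasant for the learner.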
