When is balance not better?

So you’ve started your journey into positive reinforcement training. You’re loving it. But some behaviors were fine with negative reinforcement, so why start them over to train them positively? Everyone in your barn uses traditional or natural horsemanship, and you want to stick with some negative reinforcement to fit in. You would like to compete or at least appear normal in public, so you trickle in a bit of pressure. You’re worried about other people who may handle your horse, so you make sure they know how to give to pressure. You want to prepare your horse for potential emergencies, so they ought to be used to some aversives in their lives. Anyway, life can’t be perfect all the time, so what’s wrong with using some aversives in training?

Well, let’s get into this more deeply. Let’s look at how these two quadrants work to create and motivate behavior.

Positive reinforcement adds an appetitive stimulus (something the learner thinks they want, like, or appreciate), which increases the likelihood this behavior will appear again in the same scenario. So the next time they are faced with this situation (the same antecedents, if you prefer), they will be more likely to choose this behavior, as it has resulted in good things for them in the past.

Negative reinforcement works as an equal opposite. Something aversive (to the learner: something they dislike or want to avoid or escape from) is removed. But in order to remove it, it must first be added. In a training construct (not a natural one) a handler would add an aversive stimulus, such as a shock, a tap, a smack, a push or press, or a repetitive mild stimulus, in order to remove it when the correct behavioral response is given. So next time the horse is in this scenario (next time these antecedents are present) the horse will be more likely to choose this behavior, as it has provided the horse relief in the past.

In negative reinforcement the cue is built into the training situation. The addition of the aversive, or a threat of adding the aversive (usually a hand signal or mild version of the aversive, like a gentle squeeze, tap, or press), serves as the cue, the signal to perform the behavior, and its removal serves as the reinforcement. When the cue is just a threat of an aversive, we call it a conditioned aversive. A gentle press on the hip may not have been aversive to your horse until that gentle touch began predicting stronger aversives, at which point it came to mean “move your hip away”. These conditioned aversives also serve as reinforcement when they are removed.

Meanwhile, with positive reinforcement we often have to add a cue, as we generally use shaping, targeting, or capturing to create a behavior. This means that when we have the behavior going well, we need to give it a predictor signal, a conditioned appetitive, which tells the horse which behavior will be reinforced right now. So we use a visual, verbal, situational, or tactile cue. We also need to put the cue on Stimulus Control. This means the correct behavior happens only when we have provided the cue and not when we haven’t. Negative reinforcement has implied stimulus control, in that most horses aren’t going to offer an avoidance behavior when there is nothing to avoid. But they may offer a behavior when seeking an appetitive, which is why cue clarity is important.

Ok, so both techniques work equally well to reinforce and maintain behaviors and put them on cue. Why not use both together? Use all the tricks in our bag of training tools? Wouldn’t that make behaviors super strong?

Well let’s take a look at this in a more analytical way…

In the scale above, the two ends represent the percentage likelihood that this behavior will happen again in this scenario. You’ll notice that if you add a mild appetitive when positively reinforcing the behavior, you are only adding a small percent chance that the behavior will happen again in the future. Take reinforcing a behavior with hay pellets: this may work when there are few other options, but this 10% chance may easily be outweighed by stimuli in the environment (a scary sound, delicious grass, or any more valuable stimulus). But each time we reinforce this behavior with a small appetitive reinforcer we add strength to the behavior, adding percentages. The value slowly degrades over time, which is why reviewing old behaviors is important. If we repeat the behavior and the reinforcement regularly and consistently, the behavior will become strong, with a higher and higher percent chance of happening. This also means it’s more likely to outweigh stimuli in the environment. For example, a horse may only be willing to perform well known behaviors when working on grass: with such a high value option at their feet, they aren’t likely to choose a low percentage behavior.

We call this concept conditioning. The behavior gets progressively more appetitively conditioned, becoming not just more enjoyable to the horse but more likely to happen when we really need it. This is why we don’t just teach behaviors, we maintain them with a strong positive reinforcement history. We can also speed up this process by adding higher value reinforcers. So while my horse may only be willing to do well known behaviors on grass to earn hay pellets, they may be very willing to put in great effort and load onto a trailer instead of grazing, in exchange for a more highly valued reinforcer. So with regular conditioning and with a strong reinforcement value, we can make any behavior impossibly strong.
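For readers who like to see the arithmetic, here is a rough toy model of the idea, not a scientific formula. It assumes each repetition simply adds the reinforcer's value to the behavior's strength, capped at 100%, and it ignores the slow degradation over time mentioned above. The 10% for hay pellets comes from the example; the function names and the 60% value for grass are made up purely for illustration.

```python
def behavior_strength(repetitions: int, value_per_repetition: float) -> float:
    """Percent likelihood after repeated reinforcement, capped at 100."""
    return min(100.0, repetitions * value_per_repetition)


def outweighs(strength: float, competing_value: float) -> bool:
    """True if the behavior is stronger than a competing stimulus."""
    return strength > competing_value


HAY_PELLETS = 10.0  # percent added per repetition (from the example in the text)
GRASS = 60.0        # hypothetical pull of grass at the horse's feet (assumption)

new_behavior = behavior_strength(1, HAY_PELLETS)   # reinforced once
practiced = behavior_strength(8, HAY_PELLETS)      # reinforced eight times

print(outweighs(new_behavior, GRASS))  # False
print(outweighs(practiced, GRASS))     # True
```

A higher value reinforcer plays the same role as more repetitions in this sketch: raising `value_per_repetition` gets the behavior past the competing stimulus in fewer sessions.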

The same concept applies, as an equal opposite, to negative reinforcement. That is to say, when we apply a mild aversive and remove it, it increases the likelihood of the behavior recurring in this scenario by only a small percentage. Drilling the behavior, repeating this process over and over again, will increase the percentage of the behavior’s likelihood the same way it did with repeated mild appetitives. This gradually increases the strength of the behavior around distractions or competing stimuli. Using stronger aversives works the same way, greatly increasing the percentage likelihood of the behavior occurring. Just like in the positive scenario, we also need to maintain, repeat, or increase the value of the aversive to maintain the frequency of behavior.

While we had a mild complication with the appetitive behaviors needing us to add an intentional cue, there is also a complication with using negative reinforcement. The complication here is that we can’t relieve the aversive without compliance. If we apply a cue, a mild aversive, and the horse does not perform the correct behavior, then by releasing the pressure we have effectively taught the horse that the behavior isn’t what makes the aversive go away: just wait and it will go away on its own. So it’s vital that if a horse does not perform the correct behavior with the mild aversive, we either repeat the aversive (which grows stronger with each repetition due to muscle fatigue and irritation) or increase it. Even worse, should a horse perform the wrong behavior but find relief, you may inadvertently reinforce a dangerous behavior. This is often how aggressive behaviors are unintentionally trained. For example, a rider keeps cueing the horse with aversives until finally the horse gets annoyed and bolts or bucks. Now imagine the rider comes off. The horse has effectively learned that bolting or bucking will remove the entire aversive stimulus (the rider). This cycle of needing to repeat and escalate, repeat and escalate, becomes self-destructive very quickly.

Wow – ok, so both of these scenarios work to increase the frequency of behavior, and they both have their pros and cons. Both can become equally strong with proper training, care, and maintenance. So why not use both? Will they not add to each other to make the behavior stronger? Some people believe that if you add a threat and a promise you will have a stronger behavior. But let’s look at ourselves personally. Have you ever been threatened into doing something you didn’t want to do, but then rewarded for it? How did that feel? Good?

A few human examples… You have a crappy roommate. They’re noisy, they leave their mess around, and one day you get annoyed and clean up the mess. Your roommate is so happy they buy Chinese food for dinner, sweet! But then you’re stuck cleaning up again. Did the Chinese food for dinner really make it worth the constant annoyances?

Your boss nags and nags, driving you crazy, calls you on your time off, begging you to get a project done. You finally do. Pissed and fed up with your stupid boss, you drop it on their desk and they give you your next paycheck in an envelope. Does that check feel good? Or does it feel like “ughhhh I just have to deal, I need the stupid money”? Let’s think about this in horse context too. We often think we are feeding treats, bonus food, but for a horse who is designed to be consuming food 24/7, every bite feels like a necessity to their survival, not a bonus or a treat.

Imagine fighting with your spouse over who ought to do the dishes. Finally you give in and do the dishes, pissed, because you always have to do the dishes! But they come over and give you a hug to say “thank you”. Does that hug make you feel better? Or are you too mad to even take the stupid hug? You don’t want a hug, you want them to do their part and not make you do the extra work anyway.

How about a spouse who scolds you when you don’t do the laundry just right? But when you present the laundry, sometimes, they are so happy they give you a big hug! Are you happy with that hug? Or just relieved you didn’t get scolded?

This uneasiness, this lack of clarity, this not knowing, creates anxiety and a great deal of conflict. If a horse is working to avoid something they don’t like, once they’ve successfully avoided it, adding an appetitive doesn’t make the deal any sweeter, nor add any strength to the behavior. It may only make the appetitive a little less appreciated.

This concept is what we call a poisoned cue. We’ve all felt it. But why does this happen? Notice how the scale is a line, and the strength of the aversive and the strength of the appetitive are on opposite sides. This means that if you remove a strong aversive, then add an appetitive, you only tip the scale back toward neutral, reducing the likelihood of the behavior happening again. Often horses will even stop taking treats for behaviors when a scenario becomes poisoned. This happens because aversive stimuli and appetitive stimuli engage different parts of the brain/nervous system, and different emotions. More on this concept: https://empoweredequines.com/2019/11/13/poisoned-cues/
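One loose way to picture the scale in numbers, offered only as an illustration of the idea and not as a measurement of anything: treat the aversive side as pulling one way and the appetitive side as pulling the other, and read the behavior's likelihood as the distance from the neutral middle. The function name and the specific values below are hypothetical.

```python
def likelihood(aversive: float, appetitive: float) -> float:
    """Distance from neutral on a single line, with aversive and
    appetitive strength sitting on opposite sides of zero."""
    position = appetitive - aversive
    return abs(position)


# A strong aversive on its own leaves the behavior far from neutral.
print(likelihood(aversive=80, appetitive=0))   # 80

# Adding a treat on top pulls the scale back toward the middle.
print(likelihood(aversive=80, appetitive=30))  # 50

# Equal and opposite: the scale sits at neutral.
print(likelihood(aversive=80, appetitive=80))  # 0
```

In this picture the treat never stacks on top of the pressure. It only cancels part of it out.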

In this next section I’ll be discussing the emotional parts of the brain/mind and nervous system of the horse, using research from Jaak Panksepp and many others. The emotional systems will be capitalized in order to differentiate when I’m discussing the system in the body vs. the general use of the term as an emotion.

Negative reinforcement works within the escape/avoidance (SEEKING), FEAR, PANIC, and RAGE systems. This is because when there is an aversive stimulus (no matter how mild) the learner works to avoid it via freeze (this is when they brace against pressure or stop to assess a visual stimulus that may be aversive), fight (RAGE), or flight (FEAR). We use these systems, and the hormones involved in them, to teach behaviors that we like. But in nature, when a horse avoids a mild aversive (like bugs that are irritating) they are still seeking relief, avoiding discomfort, and defending their well-being (reducing discomfort and reducing likelihood of disease). Or when they avoid a serious aversive (the flying hoof of a peer horse) they are fleeing in flight, FEAR, avoiding the serious aversive that could compromise their physical health or well-being. Being lame, isolated from peers, eaten alive by bugs, hot, cold, or injured: these compromise their ability to survive. So when we say we are only using mild aversives and the horse isn’t really afraid, the truth is they would only be responding if they felt they must for the highest likelihood of survival and their best well-being. Negative reinforcement works within the mindset of maintaining safety.

Positive reinforcement, however, works within the SEEKING, CARE, and PLAY systems. This is searching out and earning the many needs of life (primary reinforcers) as well as social reinforcement and just having fun. When a horse feels safe, comfortable and socially appropriate, they will engage in play, learn more actively, become more curious, and have an overall better well-being. Our positive training can easily mimic the natural versions of these systems. Grazing, mutual grooming, playing with investigation and positive outcome, all happen in nature when the herd is doing well.

The problem is that these systems frequently contradict one another. A horse will not engage in PLAY if their FEAR or RAGE systems are heightened, though some play can mimic fear or rage behaviors (this is different). PANIC (concern of losing a peer or separation from a social unit) directly contradicts CARE (the feeling of comfort and love, being accepted in their social unit). Likewise, SEEKING relief directly contradicts SEEKING something desired. A horse isn’t going to stop and graze or mutual groom when they feel the need to escape from a predator, no matter how desperate they are. These hormones, these parts of the brain/mind/body, work in opposition to each other.

 

Wow, that’s a lot to take in. To sum up, the emotional systems in R- and R+ regularly contradict each other, creating conflict, confusion, and lack of clarity, which in turn create equally sporadic (and potentially very dangerous) behavior from the horse. We call this “Poisoning” a behavior.

But what about on separate behaviors? Use R- for one behavior and R+ for another?
Sure, this shouldn’t poison the behaviors at all! Because it’s clear which behavior earns which consequence. There is a clear Antecedent-Behavior-Consequence sequence. There is no conflict or frustration, only straightforward seeking or avoidance.

However there is an influence on your relationship.
Remember conditioning from earlier? How cues become conditioned appetitives or aversives? Well, remember that everything in a scenario is being conditioned. If every time I approach my horse we are in a mindset of seeking appetitives, play, and social care, the horse feels good whenever they are with me, and I will become a conditioned appetitive. If every time I approach my horse they are in avoidance, fear, rage, or panic, I will progressively become conditioned as an aversive (I predict aversive situations). Now, if we predict both, sometimes appetitive and sometimes aversive, we are conditioned the way most things in the world are. We have very few relationships in our lives that are truly appetitive or truly aversive. Only people you really love or people you really hate; everyone else is somewhere in the middle, right? Well, where do you want to be with your horse? Everyone can answer that independently. While I assume we all want our horses to love us, that might not always be true. Some people just want a horse to do a job and relationship doesn’t really matter. That’s ok (not my style, but to each their own).

Ok, so now comes an emergency. There are a number of more positive ways to overcome an emergency and a number of aversive ways. Both work, some work better than others, and sometimes we just don’t have a damn choice! We do what we must for the health and safety of everyone involved. More on this here: https://empoweredequines.com/2019/11/26/dealing-with-emergencies/

But how does our conditioning (or relationship) affect these emergencies? We call this concept an emergency piggy bank. If our relationship is extremely positively conditioned, this means I have a lot of money in my relationship bank account. Say I’m 100% awesome to my horse. Then something awful happens, a vet emergency. Everyone struggles, fights happen, aversives are spilled everywhere! But the next day comes and my horse looks at me… Now I’m not 100% awesome anymore. Maybe now I’m only 60% awesome. I’ve taken a big deduction from my bank account. Now suppose instead that I use a mix of appetitives and aversives, so my conditioning is mixed and my relationship is 50% appetitive, 50% aversive, and we have the same emergency as before. While before I was only knocked down to 60%, now I’m knocked down to 10%. Our relationship is tragically damaged. We’ve lost a lot.
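The piggy bank arithmetic above can be written out in a few lines. This is just the same example restated, assuming the emergency costs a flat 40 percentage points (the drop from 100% to 60%) no matter where you start; the names are made up for illustration, and real relationships obviously don’t come with exact numbers.

```python
EMERGENCY_COST = 40  # percentage points, taken from the 100% -> 60% example


def after_emergency(balance: int, cost: int = EMERGENCY_COST) -> int:
    """Relationship 'balance' left after one emergency withdrawal.
    The balance cannot drop below zero in this sketch (an assumption)."""
    return max(0, balance - cost)


print(after_emergency(100))  # 60: a fully positive relationship absorbs the hit
print(after_emergency(50))   # 10: a mixed relationship is left nearly empty
```

The withdrawal is the same size both times. What changes is how much was in the account to begin with.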

I know this is an overwhelming amount to take in. But consider these thoughts as you approach your horse and ask yourself how you really want to be conditioned.