The Purpose of Using Activation Functions in Neurons of a Neural Network
If you have seen a neural network, you must have come across the term activation function. The term is analogous to the "activity" of brain neurons in response to a certain stimulus. Different parts of the brain are reserved for different functions, and by reserved we mean that the neurons in different parts get "activated" to a certain potential depending on the type of stimulus, for example visuals, smells, touch, fear, or sound.
Before talking about activation functions, let us briefly discuss a factor that is important in making predictions. Suppose you are asked whether a movie is good or not, and the constraint is that you can only reply with a yes or a no. For certain movies that you either love or detest, this approach would work. But in some cases such options are not enough to explain whether a movie is watchable or not. If instead I ask you to rate the movie anywhere between 1 and 10, you can reply with a 7 for a movie that was good enough and a 5 for something you can't decide on.
Even better would be a rating between 0 and 1, as this would let us view it as the probability that you liked the movie. Another disadvantage of the yes/no approach is that it makes decisions "too harsh". For example, consider two movies with ratings 4.99 and 5.01. The yes/no approach would tag one movie as liked and the other as disliked, even though in reality the user had a very similar liking for both movies.
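The harshness of the yes/no approach is easy to see in a few lines of Python. This is a minimal sketch with a made-up cutoff of 5, purely for illustration:

```python
# A hard yes/no decision: anything at or above the cutoff is "liked" (1),
# anything below it is "disliked" (0).
def hard_threshold(rating, cutoff=5.0):
    return 1 if rating >= cutoff else 0

print(hard_threshold(4.99))  # 0 -> tagged as disliked
print(hard_threshold(5.01))  # 1 -> tagged as liked
```

Two nearly identical ratings land on opposite sides of the decision, which is exactly the problem a smooth activation function fixes.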
The Purpose of Activation Functions
The activation functions present at the neurons of a neural network solve the above problems. They make the output smooth and graded, and hence show how "activated" a certain neuron is. They also do the job of scaling, so that all the neurons give an output within a given range. There are many functions that are used as activation functions. Let's have a look at two activation functions that are used a lot.
The Sigmoid Function

The above image shows how:
- the range of continuous values is scaled down to a range between zero and one;
- the output, instead of transitioning as a step function (yes/no), gives a probability. If the curve is shifted so that its midpoint sits at x = 5, we get exactly the movie example we talked about earlier.
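The two points above can be sketched in code. The sigmoid function is the standard 1 / (1 + e^(-x)); the shift by 5 is my assumption to match the movie-rating cutoff from earlier:

```python
import math

def sigmoid(x):
    # Squashes any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

# Shifting the curve so its midpoint sits at x = 5, the movie-rating cutoff:
def liking_probability(rating):
    return sigmoid(rating - 5.0)

print(liking_probability(4.99))  # just below 0.5
print(liking_probability(5.01))  # just above 0.5
```

Note how the two near-identical ratings now get near-identical probabilities around 0.5, instead of opposite yes/no labels.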
But as you can see, there are two regions where the gradients start to vanish, which results in what is known as the problem of vanishing gradients. To deal with it, another activation function that is famously used is the ReLU function. Below you can see various functions that are used as activation functions. The ReLU activation function is widely used as it helps in avoiding the problem of vanishing gradients during backpropagation.
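The vanishing-gradient contrast between sigmoid and ReLU can be checked numerically. This sketch just compares the derivatives of the two functions at a few points:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # Derivative of sigmoid: s * (1 - s). It peaks at 0.25 when x = 0
    # and shrinks toward zero for large |x| -- the vanishing regions.
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # Derivative of ReLU: a constant 1 for every positive input,
    # so the gradient never shrinks no matter how large x gets.
    return 1.0 if x > 0 else 0.0

for x in (0.0, 5.0, 10.0):
    print(f"x={x}: sigmoid grad={sigmoid_grad(x):.6f}, relu grad={relu_grad(x)}")
```

At x = 10 the sigmoid gradient is already on the order of 0.00005, while the ReLU gradient is still exactly 1; stacking many sigmoid layers multiplies these tiny factors together during backpropagation, which is where the gradients "vanish".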

How Are Activation Functions Applied in a Neural Network
Activation functions are present at every neuron. Let's look at a single neuron and try to understand how it works. As you can see below, the net input (a weighted sum of the inputs) added to the bias is then passed through the sigmoid activation function to produce a scaled output between 0 and 1.
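The single-neuron computation described above can be written out directly. The inputs, weights, and bias here are arbitrary values chosen for illustration:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias):
    # Net input: the weighted sum of the inputs, plus the bias...
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # ...then squashed by the sigmoid activation into (0, 1).
    return sigmoid(z)

# Hypothetical inputs, weights, and bias, purely for illustration:
out = neuron(inputs=[0.5, -1.2, 3.0], weights=[0.4, 0.1, 0.7], bias=-0.5)
print(out)  # a value strictly between 0 and 1
```

However large or small the weighted sum gets, the activation guarantees the neuron's output stays in the same (0, 1) range as every other neuron's, which is the scaling role described above.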
