HOW MAJOR TECH INDUSTRY LEADER TESLA IS INVESTING HEAVILY IN THE AI SECTOR. ALMOST EVERY SCIENCE FICTION MOVIE RELEASED IN THE LAST 2 DECADES HAS A PLOT REVOLVING AROUND ARTIFICIAL INTELLIGENCE. IN THE NEXT 2 DECADES WE MIGHT SEE THESE PLOTS TURNING INTO REALITY, MOSTLY IN A GOOD SENSE. ALL CREDIT GOES TO THE HUGE TECH […]
MATHEMATICS REQUIRED FOR ML / DL /AI ?
WHAT MATHEMATICAL CONCEPTS YOU NEED TO HAVE BEFORE TAKING UP COURSES ON ML / DL / AI. THERE ARE PREDEFINED LIBRARIES IN PYTHON WHICH MAKE CREATING MODELS SUPER EASY FOR ANYONE. FOR A BETTER UNDERSTANDING OF WHAT IS ACTUALLY HAPPENING BEHIND THE SCENES YOU NEED TO HAVE COMMAND OVER THE FOLLOWING MATHEMATICAL DOMAINS. SINCE “DATA SCIENCE” IN ITSELF […]
HOW DATA SET SIZE AFFECTS AI PERFORMANCE
HOW NEURAL NETWORKS WILL RULE 2021-2030. RESEARCH ON AI AND NEURAL NETWORKS BEGAN WAY BACK IN THE MID 1900S. SO WHY IS IT THAT CERTAIN AI BASED TECHNOLOGIES HAVE SEEN EVOLUTION ONLY IN THE LAST DECADE? THE ANSWER TO THIS QUESTION LIES IN THE VERY NATURE OF NEURAL NETWORKS. TO UNDERSTAND THIS BETTER LET’S […]
DEEP LEARNING IN GAMES - THE ROLE OF GPUs
HOW NVIDIA PREDICTED THE FUTURE OF AI TEN YEARS BACK, AND HOW ITS GPUs ARE PLAYING A KEY ROLE IN THE DEVELOPMENT OF DEEP LEARNING IN GAMES.
YOU MUST BE FAMILIAR WITH THE CPU (CENTRAL PROCESSING UNIT). AND IF YOU ARE EITHER A GAMING ENTHUSIAST OR SOMEONE WHO KEEPS UP WITH AI ADVANCEMENTS, CHANCES ARE HIGH THAT YOU HAVE HEARD OF GPUs. THEY HAVE EMERGED AS A BOON FOR AI DEVELOPMENT, AND THEY CONTINUE TO DRIVE GAMING AND GRAPHICS ADVANCEMENTS. WITH THE INCREASING USE OF DEEP LEARNING IN GAMES, WE MIGHT SEE VERY DIFFERENT KINDS OF GAMES BY THE END OF THIS DECADE: SMART AND INTELLIGENT ONES!
LET’S ANSWER THE BASIC QUESTION FIRST
WHAT IS A GPU
GPU STANDS FOR GRAPHICS PROCESSING UNIT. THE CORE PRINCIPLE ON WHICH A GPU WORKS IS “DIVIDE AND CONQUER”. IN TECHNICAL TERMS WE CALL IT PARALLEL COMPUTING: A LARGE JOB IS SPLIT INTO MANY SMALL, INDEPENDENT PIECES THAT ARE WORKED ON AT THE SAME TIME. THIS IS WHAT MAKES A GPU SO GOOD AT WHAT IT DOES. THIS IS IN CONTRAST TO A CPU, WHICH CAN DO ONLY A HANDFUL OF CALCULATIONS AT ONCE. BUT HOW DOES A GPU DO SO? WELL, THE ANSWER IS FAIRLY SIMPLE!
NUMBER OF CORES IN A GPU
IF YOU ARE READING THIS ON A LAPTOP, OR IF YOU OWN ONE, YOU MUST HAVE HEARD TERMS LIKE DUAL CORE, QUAD CORE AND OCTA CORE. A CPU CONSISTS OF A FEW POWERFUL CORES WHICH CAN EACH HANDLE A TASK AND SO SUPPORT MULTIPROCESSING. IN CONTRAST, A GPU CONSISTS OF HUNDREDS OR EVEN THOUSANDS OF SIMPLER CORES, WHICH ALLOWS IT TO PERFORM PARALLEL COMPUTING BY BREAKING THE PROBLEM INTO SUBPROBLEMS.
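THE “BREAK THE PROBLEM INTO SUBPROBLEMS” IDEA CAN BE SKETCHED IN PLAIN PYTHON. THIS IS ONLY AN ILLUSTRATION OF THE DIVIDE / SOLVE / COMBINE STRUCTURE, NOT OF REAL GPU CODE: IT USES THE STANDARD LIBRARY THREAD POOL AS A STAND-IN FOR CORES, AND THE FUNCTION NAMES AND THE CHOICE OF 4 WORKERS ARE ASSUMPTIONS MADE FOR THE EXAMPLE.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(values, num_chunks):
    """Split a list into at most num_chunks contiguous pieces."""
    chunk_size = max(1, -(-len(values) // num_chunks))  # ceiling division
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]

def sum_of_squares(chunk):
    """The subproblem: each worker handles one chunk on its own."""
    return sum(v * v for v in chunk)

def parallel_sum_of_squares(values, num_workers=4):
    """Divide the work, solve the pieces concurrently, then combine."""
    chunks = split_into_chunks(values, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)

data = list(range(1, 1001))
total = parallel_sum_of_squares(data)
print(total)
```

EACH CHUNK IS INDEPENDENT OF THE OTHERS, WHICH IS EXACTLY THE PROPERTY THAT LETS A GPU HAND THE PIECES TO DIFFERENT CORES.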

SO HOW DOES THIS HELP IN TRAINING AI / DEEP LEARNING MODELS ?
I HOPE YOU ARE FAMILIAR WITH NEURAL NETWORKS. IF NOT, I SUGGEST READING AN INTRODUCTION TO THEM AND THEN COMING BACK TO UNDERSTAND THIS BETTER. IT’S EASY TO SEE HOW THE NUMBER OF CALCULATIONS NEEDED PER CYCLE OF FEEDFORWARD / BACKPROPAGATION INCREASES DRASTICALLY WITH THE COMPLEXITY OF THE NETWORK (MORE NEURONS, MORE HIDDEN LAYERS AND SO ON). THE NUMBER OF CALCULATIONS SOON RISES TO MILLIONS. TRAINING SUCH A MODEL ON A CPU WOULD BE A TEST OF PATIENCE. MOST OF THESE CALCULATIONS ARE INDEPENDENT MULTIPLY-AND-ADD OPERATIONS, SO A GPU CAN BREAK THE PROBLEM UP AND RUN MANY OF THEM IN PARALLEL, WHICH SPEEDS UP TRAINING SUBSTANTIALLY, OFTEN BY AN ORDER OF MAGNITUDE OR MORE.
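TO GET A FEEL FOR HOW QUICKLY THE WORK GROWS, HERE IS A ROUGH COUNT OF THE MULTIPLY-ADD OPERATIONS IN ONE FORWARD PASS OF A FULLY CONNECTED NETWORK. THE LAYER SIZES ARE HYPOTHETICAL, AND BIASES AND ACTIVATION FUNCTIONS ARE IGNORED TO KEEP THE SKETCH SHORT.

```python
def count_forward_multiply_adds(layer_sizes):
    """Count multiply-adds for one input passing through a dense network.

    Each pair of neighbouring layers contributes n_in * n_out operations.
    """
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

small_net = [784, 16, 10]              # hypothetical: one small hidden layer
large_net = [784, 512, 512, 512, 10]   # hypothetical: three wider hidden layers

small_ops = count_forward_multiply_adds(small_net)
large_ops = count_forward_multiply_adds(large_net)
print(small_ops)   # 12704
print(large_ops)   # 930816
```

THAT IS FOR A SINGLE INPUT. MULTIPLY BY THE NUMBER OF TRAINING EXAMPLES AND THE NUMBER OF PASSES OVER THE DATA, AND ADD THE BACKWARD PASS, AND THE TOTAL CLIMBS VERY FAST.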
GPUs IN GAMING, COMPUTER VISION, SUPERCOMPUTING
I THINK BY NOW YOU CAN SEE HOW GPUs WOULD OUTPERFORM CPUs IN THESE APPLICATIONS. BY THE END OF THE NEXT DECADE, AI WILL LIKELY AFFECT GAMES, HEALTHCARE AND MANY OTHER DOMAINS OF COMPUTER SCIENCE. CURRENTLY NVIDIA LEADS THE GPU MARKET, AND NVIDIA GRAPHICS CARDS ARE FAIRLY POPULAR AMONG GAMERS. MOST GAMES THAT CURRENTLY CLAIM TO USE “AI” ARE IN REALITY USING HAND-WRITTEN ALGORITHMS TO CONTROL THE ENEMIES’ BEHAVIOUR / APPROACH, BUILT ON TECHNIQUES LIKE “FINITE STATE MACHINES” RATHER THAN LEARNED MODELS. BUT SOON WE EXPECT THINGS TO CHANGE. TILL THEN, GAME ON!
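FOR READERS WHO HAVE NOT MET A FINITE STATE MACHINE BEFORE, HERE IS A MINIMAL SKETCH OF ONE DRIVING AN ENEMY CHARACTER. THE STATES, EVENTS AND TRANSITIONS ARE INVENTED FOR ILLUSTRATION AND ARE NOT TAKEN FROM ANY PARTICULAR GAME.

```python
# Hypothetical enemy states and events, chosen only for illustration.
TRANSITIONS = {
    ("patrol", "player_spotted"): "chase",
    ("chase", "player_in_range"): "attack",
    ("chase", "player_lost"): "patrol",
    ("attack", "player_out_of_range"): "chase",
    ("attack", "low_health"): "flee",
    ("flee", "health_restored"): "patrol",
}

def next_state(state, event):
    """Return the new state, or stay in the same state if the event is not handled."""
    return TRANSITIONS.get((state, event), state)

def run(events, start="patrol"):
    """Feed a sequence of events through the machine and record every state visited."""
    state = start
    history = [state]
    for event in events:
        state = next_state(state, event)
        history.append(state)
    return history

history = run(["player_spotted", "player_in_range", "low_health", "health_restored"])
print(history)  # ['patrol', 'chase', 'attack', 'flee', 'patrol']
```

NOTHING HERE IS LEARNED FROM DATA: EVERY REACTION IS WRITTEN DOWN IN ADVANCE IN THE TRANSITION TABLE.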
LINEAR REGRESSION
WHAT IS LINEAR REGRESSION, USES IN MACHINE LEARNING, ALGORITHMS
ONE OF THE MOST COMMON PROBLEMS WE COME ACROSS IN DAILY LIFE IS PREDICTING VALUES LIKE THE PRICE OF A COMMODITY, THE AGE OF A PERSON, THE NUMBER OF YEARS NEEDED TO MASTER A SKILL, ETC. AS A HUMAN, WHAT IS YOUR APPROACH WHEN YOU TRY TO MAKE SUCH PREDICTIONS? WHAT ARE THE PARAMETERS YOU CONSIDER? A HUMAN BRAIN HAS LITTLE DIFFICULTY IN REALISING WHETHER A CERTAIN PROBLEM IS LINEAR OR NOT. SUPPOSE I TELL YOU THAT A CERTAIN CLOTHING COMPANY SELLS 2 CLOTHES FOR 2K BUCKS, 4 CLOTHES FOR 4K BUCKS, BUT 8 CLOTHES FOR 32K BUCKS.
IMMEDIATELY YOUR BRAIN TELLS YOU THAT THE LAST 8 CLOTHES MUST HAVE BEEN OF A DIFFERENT QUALITY, OR FROM A DIFFERENT BRAND, MAKING THEM DIFFERENT FROM THE OTHER 2 GROUPS. BUT IF I STATE THAT THE LAST 8 CLOTHES WERE FOR 8K BUCKS, YOUR BRAIN IMMEDIATELY SPOTS A LINEAR RELATION (HERE, 1K BUCKS PER ITEM).
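THE CHECK YOUR BRAIN DOES HERE CAN BE WRITTEN DOWN IN A FEW LINES. THE SKETCH BELOW TESTS WHETHER A SET OF (QUANTITY, PRICE) POINTS LIES EXACTLY ON ONE STRAIGHT LINE; THE FUNCTION NAME `is_linear` AND THE TOLERANCE ARE ASSUMPTIONS MADE FOR THE EXAMPLE.

```python
def is_linear(points, tolerance=1e-9):
    """Return True if all (x, y) points lie on one straight line.

    Compares every point against the line through the first two points
    using a cross product, which avoids dividing by zero.
    """
    if len(points) < 3:
        return True
    (x0, y0), (x1, y1) = points[0], points[1]
    return all(
        abs((x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)) <= tolerance
        for x, y in points[2:]
    )

# (number of clothes, price in thousands) for the two scenarios in the text
odd_one_out = [(2, 2), (4, 4), (8, 32)]
consistent = [(2, 2), (4, 4), (8, 8)]

print(is_linear(odd_one_out))  # False
print(is_linear(consistent))   # True
```

REAL DATA IS RARELY THIS CLEAN, WHICH IS WHY REGRESSION LOOKS FOR THE LINE THAT IS CLOSEST TO THE POINTS RATHER THAN ONE THAT PASSES THROUGH ALL OF THEM.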
MANY DAILY LIFE PROBLEMS ARE TAGGED AS “LINEAR” OUT OF COMMON SENSE. A FEW EXAMPLES ARE:
- PRICES OF FLATS ON THE SAME FLOOR OF AN APARTMENT BUILDING WOULD BE LINEARLY PROPORTIONAL TO THE NUMBER OF ROOMS.
- THE RENT OF A CAR WOULD BE LINEARLY PROPORTIONAL TO THE DISTANCE YOU TRAVEL.
“BY LINEARLY VARYING WE DON’T MEAN THAT, WHEN PLOTTED, ALL THE DATA POINTS WOULD LIE STRICTLY ON A SINGLE LINE, BUT THAT THEY SHOW A TREND WHERE THE DEPENDENT VARIABLE CAN BE VIEWED AS SOME LINEAR FUNCTION OF THE INDEPENDENT VARIABLE + SOME RANDOM NOISE.”
THE MATH
YOU MAY BE AWARE OF THE EUCLIDEAN DISTANCE BETWEEN A STRAIGHT LINE AND POINTS THAT DO NOT LIE ON IT. IN LINEAR REGRESSION WE MEASURE SOMETHING SLIGHTLY DIFFERENT: THE VERTICAL GAP (THE DIFFERENCE IN Y COORDINATES) BETWEEN EACH POINT AND THE LINE. OUR AIM IS TO FIND A MODEL THAT USES THE PROVIDED DATA TO PREDICT THE DEPENDENT VARIABLE WHEN A CERTAIN VALUE OF THE INDEPENDENT VARIABLE IS GIVEN.
PUTTING IT MATHEMATICALLY,
FOR A GIVEN DATA SET S -> {A : B}, WHERE A IS THE INDEPENDENT VARIABLE AND B IS THE CORRESPONDING DEPENDENT ONE, FIND THE BEST PAIR (M, C) SUCH THAT THE AVERAGE OF THE SQUARES OF THE DIFFERENCES BETWEEN EVERY B AND THE CORRESPONDING Y ON THE LINE Y = MA + C IS MINIMISED, WHERE THE AVERAGE IS TAKEN OVER THE NUMBER OF POINTS.
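THE QUANTITY BEING MINIMISED CAN BE COMPUTED DIRECTLY FOR ANY CANDIDATE PAIR (M, C). THE SKETCH BELOW DOES THAT FOR TWO CANDIDATE LINES; THE DATA POINTS ARE MADE UP FOR THE EXAMPLE AND ROUGHLY FOLLOW B = 2A + 1.

```python
def mean_squared_error(points, m, c):
    """Average squared vertical gap between each b and the line y = m*a + c."""
    return sum((b - (m * a + c)) ** 2 for a, b in points) / len(points)

# Hypothetical (a, b) pairs that roughly follow b = 2a + 1
points = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8)]

good_fit = mean_squared_error(points, m=2.0, c=1.0)
poor_fit = mean_squared_error(points, m=1.0, c=0.0)

print(good_fit < poor_fit)  # True: the first line is much closer to the data
```

LINEAR REGRESSION IS THE SEARCH FOR THE PAIR (M, C) THAT MAKES THIS NUMBER AS SMALL AS POSSIBLE.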
THE LOSS FUNCTION
NOW WE KNOW WHAT WE NEED TO MINIMISE. THIS PARTICULAR QUANTITY IS TERMED THE “LOSS FUNCTION”. IT IS A MEASURE OF HOW WELL YOUR MODEL FITS THE TRAINING DATA (THE LOWER THE LOSS, THE BETTER THE FIT). LET’S SEE WHAT SOME OF THE COMMONLY USED ERROR FUNCTIONS LOOK LIKE:

WHERE e_t REFERS TO THE DIFFERENCE BETWEEN THE Y COORDINATE OF A CERTAIN DATA POINT AND THE PREDICTED Y VALUE FOR THE SAME POINT, AND N = TOTAL NUMBER OF DATA POINTS.
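WRITTEN OUT WITH THE ERROR TERM (e_t BELOW) AND THE NUMBER OF POINTS N, AND ASSUMING THE STANDARD TEXTBOOK DEFINITIONS, THREE COMMONLY USED ERROR FUNCTIONS ARE THE MEAN ABSOLUTE ERROR, THE MEAN SQUARED ERROR AND THE ROOT MEAN SQUARED ERROR:

$$
\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} |e_t|, \qquad
\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n} e_t^{2}, \qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} e_t^{2}}
$$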
WE TAKE ABSOLUTE VALUES OR SQUARES OF THESE DIFFERENCES SO THAT POSITIVE AND NEGATIVE DIFFERENCES DO NOT CANCEL OUT. ANOTHER REGULARLY USED LOSS FUNCTION IS RMSLE (ROOT MEAN SQUARED LOGARITHMIC ERROR):

WHERE Y_i AND Y(hat)_i ARE THE ACTUAL AND THE PREDICTED VALUES.
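IN ITS USUAL FORM (AN ASSUMPTION HERE, SINCE CONVENTIONS DIFFER SLIGHTLY), RMSLE ADDS 1 TO BOTH VALUES BEFORE TAKING THE LOG SO THAT A VALUE OF ZERO DOES NOT BREAK THE FORMULA:

$$
\mathrm{RMSLE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(\log(\hat{y}_i + 1) - \log(y_i + 1)\bigr)^{2}}
$$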
RMSE VS RMSLE
THE L IN RMSLE STANDS FOR “LOGARITHMIC”. IT IS PREFERRED WHEN THE VALUES SPAN SEVERAL ORDERS OF MAGNITUDE (FOR EXAMPLE, WHEN CERTAIN DATA POINTS VARY EXPONENTIALLY): TAKING THE LOG FIRST MAKES THE ERROR DEPEND ON THE RATIO BETWEEN PREDICTED AND ACTUAL VALUES RATHER THAN ON THEIR ABSOLUTE DIFFERENCE, WHICH SUBSTANTIALLY REDUCES THE EFFECT OF A POSSIBLE OUTLIER. BELOW IS A REPRESENTATION SUMMING UP WHAT THE SCENARIO LOOKS LIKE. THE DATA POINTS ARE IN BLUE, WITH THE BEST FIT LINE PASSING THROUGH THEM. NOTE HOW YOU CAN SEE A “LINEAR RELATION” BETWEEN THE DATA POINTS. A HUMAN CAN “SEE” THIS BEHAVIOUR OF A DATA SET AT A GLANCE, AND USING THIS INTUITION WE CHOSE TO FIND A “BEST FIT LINE” RATHER THAN A PARABOLA OR ANY OTHER CURVE. SUCH A PRIOR ASSUMPTION ABOUT THE FORM OF THE MODEL, MADE BEFORE FITTING IT TO THE DATA, IS CALLED “INDUCTIVE BIAS”.
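THE DIFFERENCE BETWEEN RMSE AND RMSLE IN THE PRESENCE OF AN OUTLIER CAN BE CHECKED NUMERICALLY. THIS IS A SMALL SKETCH USING ONLY THE STANDARD LIBRARY; THE NUMBERS ARE MADE UP, AND THE LOG(1 + VALUE) FORM OF RMSLE IS THE COMMON CONVENTION RATHER THAN SOMETHING STATED IN THE TEXT ABOVE.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def rmsle(actual, predicted):
    """Root mean squared logarithmic error, using log(1 + value)."""
    n = len(actual)
    return math.sqrt(
        sum((math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)) / n
    )

# Hypothetical values: a well-behaved set, then the same set plus one huge point
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 38]
actual_outlier = actual + [10000]
predicted_outlier = predicted + [8000]

rmse_growth = rmse(actual_outlier, predicted_outlier) / rmse(actual, predicted)
rmsle_growth = rmsle(actual_outlier, predicted_outlier) / rmsle(actual, predicted)

print(rmse_growth > 100)   # True: one large point dominates the RMSE
print(rmsle_growth < 2)    # True: the RMSLE barely moves
```

THE OUTLIER PREDICTION IS OFF BY 20 PERCENT, WHICH IS SIMILAR IN RELATIVE TERMS TO THE OTHER POINTS, SO THE RMSLE HARDLY CHANGES WHILE THE RMSE IS SWAMPED BY THE ABSOLUTE ERROR OF 2000.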

SOME MORE MATH
SUPPOSE FOR A SET -> {X : Y} WE WANT TO CALCULATE THE ESTIMATED FUNCTION y(hat) AS SOME FUNCTION OF X. (REMEMBER, A HAT ON TOP OF ANY VARIABLE MEANS IT IS AN ESTIMATE, NOT THE REAL VALUE. WE ALWAYS BUILD MODELS THAT TRY TO ESTIMATE A FUNCTION WHICH IS, IN THEORY, UNKNOWN.) IN THE CASE OF LINEAR REGRESSION THE ESTIMATE IS A STRAIGHT LINE, y(hat) = BETA 0 + BETA 1 * X, WHERE BETA 1 PLAYS THE ROLE OF THE SLOPE M AND BETA 0 THE INTERCEPT C FROM EARLIER. THIS IS REPRESENTED BY THE SECOND EQUATION IN THE FIGURE:

NOW SUBSTITUTING THIS INTO THE RMSE LOSS FUNCTION WE GET:
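ASSUMING THE LINEAR MODEL IS WRITTEN AS y(hat) = BETA 0 + BETA 1 * X (BETA 0 FOR THE INTERCEPT, BETA 1 FOR THE SLOPE), THE SUBSTITUTION GIVES:

$$
L(\beta_0, \beta_1) = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - \beta_1 x_i\bigr)^{2}}
$$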

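THE ABOVE EXPRESSION IS WHAT WE WANT TO MINIMISE. SINCE THE SQUARE ROOT AND THE FACTOR 1/N DO NOT CHANGE WHERE THE MINIMUM LIES, IT IS ENOUGH TO MINIMISE THE SUM OF SQUARED ERRORS. DIFFERENTIATING IT W.R.T. BETA 0 AND BETA 1 AND EQUATING EACH DERIVATIVE TO ZERO, WE GET THE FOLLOWING RESULTS.
FOLLOWING ARE THE VALUES OF THE VARIABLES WE NEEDED TO FIND: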
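FOR THE MODEL y(hat) = BETA 0 + BETA 1 * X, THE STANDARD LEAST-SQUARES RESULT (STATED HERE AS THE TEXTBOOK FORMULA, WITH A BAR DENOTING THE MEAN) IS:

$$
\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^{2}}, \qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
$$

A MINIMAL PYTHON SKETCH OF THESE TWO FORMULAS FOLLOWS. THE FUNCTION NAME `fit_line` AND THE DATA ARE ASSUMPTIONS MADE FOR THE EXAMPLE.

```python
def fit_line(xs, ys):
    """Closed-form least-squares estimates (beta_0, beta_1) for y = beta_0 + beta_1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    beta_1 = sxy / sxx
    beta_0 = y_mean - beta_1 * x_mean
    return beta_0, beta_1

# Hypothetical data that roughly follows y = 2x + 1
xs = [1, 2, 3, 4]
ys = [3.1, 4.9, 7.2, 8.8]
beta_0, beta_1 = fit_line(xs, ys)
print(round(beta_0, 2), round(beta_1, 2))  # 1.15 1.94
```

THE FITTED SLOPE AND INTERCEPT LAND CLOSE TO THE 2 AND 1 THE DATA WAS BUILT AROUND, WITH THE SMALL GAP COMING FROM THE NOISE IN THE POINTS.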

THERE ARE VARIOUS WAYS TO IMPROVE THE ABOVE MODEL TO AVOID OVERFITTING. REGULARISATION TECHNIQUES LIKE RIDGE REGRESSION, LASSO AND ELASTIC NET (A COMBINATION OF BOTH) PENALISE MODELS THAT TEND TO OVERFIT. THIS IS DONE USING LOSS FUNCTIONS DIFFERENT FROM THE ONES WE HAVE USED HERE: THE DIFFERENCE ARISES FROM ADDING PENALTY TERMS ON THE SIZE OF THE COEFFICIENTS TO THE LOSS FUNCTION ALREADY DISCUSSED.
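AS A ROUGH GUIDE, THE THREE PENALISED LOSSES CAN BE WRITTEN AS BELOW. THE EXACT SCALING OF EACH TERM VARIES BETWEEN TEXTBOOKS AND LIBRARIES, THE PENALTY STRENGTHS (LAMBDA) ARE VALUES YOU CHOOSE, AND THE INTERCEPT IS USUALLY LEFT OUT OF THE PENALTY:

$$
\begin{aligned}
\text{Ridge:} \quad & \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^{2} + \lambda \sum_{j} \beta_j^{2} \\
\text{Lasso:} \quad & \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^{2} + \lambda \sum_{j} |\beta_j| \\
\text{Elastic net:} \quad & \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^{2} + \lambda_1 \sum_{j} |\beta_j| + \lambda_2 \sum_{j} \beta_j^{2}
\end{aligned}
$$

THE LARGER THE PENALTY STRENGTH, THE MORE THE COEFFICIENTS ARE PULLED TOWARDS ZERO, AND THE SIMPLER THE FITTED MODEL BECOMES.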