
AI: Leveraging AWS and GCP AI Services


Artificial Intelligence (AI) has emerged as a transformative force, revolutionizing industries and shaping the way we approach problem-solving and innovation. Cloud service providers like Amazon Web Services (AWS) and Google Cloud Platform (GCP) have made AI accessible to businesses of all sizes, empowering them with advanced AI technologies to drive growth and efficiency. In this comprehensive guide, we will explore the diverse AI services offered by both AWS and GCP, discuss their potential applications, and understand how businesses can seamlessly integrate these powerful tools with their backend systems and APIs.

AI Services Offered by AWS

  1. Amazon Rekognition: Advanced Image and Video Analysis

Amazon Rekognition, a flagship AI service from AWS, enables businesses to analyze images and videos using cutting-edge computer vision algorithms. With features like facial recognition, object detection, and scene understanding, Rekognition finds applications in various industries. E-commerce companies can utilize it for product tagging and personalized recommendations, while security systems can enhance surveillance with real-time face recognition.

  2. Amazon Polly: Natural Text-to-Speech Conversion

Amazon Polly empowers businesses with the capability to convert written text into lifelike speech in multiple languages. Audiobook creation, language translation, and voice-enabled applications are just a few examples of Polly’s versatile applications, enhancing accessibility and user engagement.

  3. Amazon Lex: Building Intelligent Conversational Interfaces

Amazon Lex simplifies the development of interactive chatbots and conversational interfaces that offer personalized customer experiences. Leveraging natural language understanding (NLU) capabilities, Lex engages users with human-like interactions, enhancing customer support and satisfaction.

  4. Amazon Comprehend: Extracting Insights from Unstructured Text

Amazon Comprehend harnesses the power of Natural Language Processing (NLP) to extract valuable insights from vast amounts of unstructured text. Businesses can perform sentiment analysis, entity recognition, language detection, and topic modeling, thereby understanding customer feedback, market trends, and brand sentiment.

  5. Amazon SageMaker: Simplifying Machine Learning Workflows

Amazon SageMaker offers a fully-managed environment for building, training, and deploying machine learning models. This service streamlines the process of developing intelligent applications, allowing data scientists and developers to efficiently collaborate.

  6. Amazon Forecast: Predicting Future Trends with Machine Learning

Amazon Forecast enables businesses to make data-driven decisions by predicting future trends and demands. Leveraging AI and machine learning, this service optimizes inventory management, sales forecasting, and resource allocation.

  7. Amazon Personalize: Delivering Tailored Recommendations

Amazon Personalize enables businesses to offer personalized product and content recommendations based on user behavior and historical data. This personalization enhances customer engagement, satisfaction, and conversion rates.

AI Services Offered by GCP

  1. Cloud Vision API: Powerful Image Analysis

Google Cloud’s Vision API allows businesses to integrate AI-powered image analysis capabilities into their applications. With features like label detection, face detection, and OCR (optical character recognition), Vision API finds use in e-commerce, healthcare, and content management industries.

  2. Text-to-Speech API: Natural-Sounding Speech Synthesis

The Text-to-Speech API from GCP converts written text into natural-sounding speech, offering multilingual support and a range of voice options. Applications include voice-enabled applications, language translation, and accessibility features for visually impaired users.

  3. Dialogflow: Building Conversational Interfaces

Dialogflow enables businesses to create chatbots and conversational interfaces with NLU capabilities. The platform supports multiple messaging platforms, making it suitable for various use cases, including customer support and virtual assistants.

  4. Natural Language API: Understanding Textual Data

Google Cloud’s Natural Language API provides sentiment analysis, entity recognition, and syntax analysis to gain insights from unstructured text. Businesses can use it for customer feedback analysis, content classification, and trend monitoring.

  5. AI Platform: Scalable Machine Learning

GCP’s AI Platform offers a scalable infrastructure for machine learning model development and deployment. Data scientists and developers can collaborate effectively to build intelligent applications.

  6. Cloud AI Platform Notebooks: Collaborative Data Science

The AI Platform Notebooks facilitate collaborative data science, providing a managed Jupyter Notebook environment for data exploration, model development, and visualization.


AWS and GCP offer comprehensive suites of AI services that enable businesses to harness the power of AI for growth and efficiency. By leveraging AWS services such as Rekognition, Polly, Lex, Comprehend, SageMaker, Forecast, and Personalize, or GCP alternatives such as the Vision API, the Text-to-Speech API, Dialogflow, the Natural Language API, AI Platform, and AI Platform Notebooks, organizations can gain valuable insights, optimize operations, and deliver personalized experiences to their customers.

Integrating these AI services with backend systems and APIs is achievable through the use of respective SDKs and APIs, making it easier for businesses to embrace AI and unlock its full potential.

In upcoming blogs we will discuss Azure and cover each of these services in detail.

SQL for beginners

SQL For Analytics

About the author

Aparna Mishra

Data science Analyst


SQL stands for Structured Query Language, also pronounced 'Sequel'. It is the language for communication between a user and a database. Popular SQL databases include Oracle, MS SQL Server, MySQL, etc.

Sublanguages in SQL  :

  • Data Definition Language (DDL)
  • Data Manipulation language (DML)
  • Data Query Language (DQL)
  • Data Control Language (DCL)
  • Transaction Control language (TCL)

Importing Data from a File:

Syntax (PostgreSQL COPY):

COPY {table_name} (col1, col2, col3, ...colN)

FROM 'path / location of file' DELIMITER ',' CSV HEADER ;

Example:

COPY orders FROM 'C:\Users\aparna\Downloads\orders.csv' DELIMITER ',' CSV HEADER ;   -- dummy file path / location

In SQL Server, the equivalent command is BULK INSERT:

BULK INSERT {table_name}

FROM 'path / location of file'

WITH (

    FIRSTROW = 2,                 -- as the 1st row is the header

    FIELDTERMINATOR = ',',        -- delimiter

    ROWTERMINATOR = '\n'          -- used to shift control to the next row

);

If the file has no header row, a shorter form works:

COPY {table_name} FROM 'location of file / path' DELIMITER ',' ;
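The same import idea can be sketched in application code with Python's built-in csv and sqlite3 modules: skip the header row, then bulk-insert the remaining rows. The orders columns below are illustrative, not taken from a real file.

```python
import csv
import io
import sqlite3

# A small in-memory stand-in for orders.csv (header row + data rows).
csv_text = """order_id,amount,order_date
15,12900,2020-09-18
12,800,2020-09-10
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INT, amount INT, order_date TEXT)")

reader = csv.reader(io.StringIO(csv_text))
next(reader)  # skip the header row (the HEADER / FIRSTROW = 2 equivalent)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", reader)
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # 2
```
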

DDL Commands and Uses :

  • CREATE : Used to create a table and define its columns and their data types.

CREATE TABLE {table_name}

( column1 datatype,

column2 datatype,

column3 datatype,

...... columnN datatype ) ;

Example :

CREATE TABLE student ( student_id INT, first_name VARCHAR(50), age INT ) ;

  • ALTER : Used to modify a database table.

  1. Adding a new column – ALTER TABLE {table_name} ADD {column_name} VARCHAR(20);

  2. Dropping an existing column – ALTER TABLE {table_name} DROP COLUMN {column_name};

  3. Changing a data type – ALTER TABLE {table_name} ALTER COLUMN {column_name} CHAR(20);

  • DROP : Drops the table permanently from the database.

DROP TABLE {table_name};

  • TRUNCATE : Removes the data and not the structure of the table.

TRUNCATE TABLE {table_name} ;
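The DDL commands above can be tried end to end with Python's built-in sqlite3 module. Table and column names here are illustrative; note that SQLite supports ADD COLUMN but not ALTER COLUMN type changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define the table and its columns.
conn.execute("CREATE TABLE student (student_id INT, first_name VARCHAR(50))")

# ALTER: add a new column to the existing table.
conn.execute("ALTER TABLE student ADD COLUMN age INT")

# PRAGMA table_info lists the columns we just defined.
cols = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
print(cols)  # ['student_id', 'first_name', 'age']

# DROP: remove the table permanently.
conn.execute("DROP TABLE student")
```
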

DML Commands and Uses :

  • INSERT : The INSERT INTO statement is used to add new records to an existing table.

INSERT INTO {table_name}

( column1, column2, ...... ) VALUES ( value1, value2, ...... ) ;

Example :

INSERT INTO student ( student_id, first_name ) VALUES ( 1, 'ABC' ) ;

  • UPDATE : Used to modify existing records in a table.

UPDATE {table_name}

SET column1 = 'value1',

column2 = 'value2'

WHERE { condition } ;

Example :

UPDATE student SET age = 14 WHERE student_id = 4 ;

  • DELETE :  Used to delete the existing records from a table.

DELETE FROM { TABLE NAME } WHERE { condition } ;

    DELETE FROM student WHERE student_id = 6;
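The DML commands above can be run together with Python's built-in sqlite3 module; the student rows below are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INT, first_name TEXT, age INT)")

# INSERT: add new records into the existing table.
conn.executemany(
    "INSERT INTO student (student_id, first_name, age) VALUES (?, ?, ?)",
    [(4, "Asha", 13), (6, "Dev", 12)],
)

# UPDATE: modify an existing record.
conn.execute("UPDATE student SET age = 14 WHERE student_id = 4")

# DELETE: remove an existing record.
conn.execute("DELETE FROM student WHERE student_id = 6")

print(conn.execute("SELECT student_id, age FROM student").fetchall())  # [(4, 14)]
```
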

DQL Commands and Uses :

  • SELECT : Used to fetch data from a database table.

    SELECT * from { TABLE NAME } WHERE { condition };

  1. SELECT student_id, first_name, std FROM student ;

  2. SELECT student_id FROM student WHERE Class = 'VII' ;

  3. SELECT cust_id, name, address FROM customer WHERE age > 40 ;

  • SELECT DISTINCT : The DISTINCT keyword is used with SELECT to eliminate all the duplicate records and fetch only the unique ones.

     SELECT DISTINCT { column name } FROM {TABLE NAME };

  1. SELECT DISTINCT address FROM customer ;
  2. SELECT DISTINCT first_name FROM student ;
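A quick sqlite3 illustration of DISTINCT, using a made-up customer table with a duplicated address:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INT, address TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, "Delhi"), (2, "Pune"), (3, "Delhi")])

# DISTINCT eliminates duplicates and keeps only the unique values.
rows = conn.execute(
    "SELECT DISTINCT address FROM customer ORDER BY address"
).fetchall()
print(rows)  # [('Delhi',), ('Pune',)]
```
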

DCL Commands and Uses :

  • GRANT : Used to give a user access or privileges on the database.

GRANT SELECT, INSERT ON {table_name} TO {user_name} ;

  • REVOKE : Used to take back privileges / access on the database.

REVOKE SELECT, INSERT ON {table_name} FROM {user_name} ;

TCL Commands and Uses :

  • COMMIT – Used to make a transaction permanent.

UPDATE { TABLE NAME } SET column1 = 'value1' WHERE { condition };

COMMIT;

  1. UPDATE student SET Name = 'XYZ' WHERE Name = 'ABC' ;

  2. UPDATE orders SET order_date = '2020-09-18'

WHERE order_id = 15 ;


  • ROLLBACK – Used to restore the database to its state at the last COMMIT, or back to a named SAVEPOINT.

ROLLBACK TO savept_name;

  • SAVEPOINT – This command is used to save a transaction temporarily so that we can rollback to that point whenever we need to.

     SAVEPOINT savepoint_name ;


  1. Create a table and insert records into it.

  2. To use the TCL commands in SQL, we first initiate a transaction using the BEGIN / START TRANSACTION command.

  3. Updating the table and using COMMIT –

UPDATE student SET age = 14 WHERE student_id = 4 ;

COMMIT;

Using COMMIT makes sure the change is saved permanently; the output after COMMIT shows the table with the saved changes.

  4. Using ROLLBACK – ROLLBACK;

This command reverts all uncommitted changes made since the last COMMIT.

  5. Using SAVEPOINT – We can create as many savepoints as we want while making changes to a table.

SAVEPOINT {savepoint_name};

SAVEPOINT save_upd;     -- taken just after the records were inserted

UPDATE orders SET amount = 12900 WHERE order_id = 15;

SAVEPOINT save_del;     -- taken after the update, before the delete

DELETE FROM orders WHERE order_id = 12;

  6. Now if we want to revert just the DELETE we can ROLLBACK TO save_del, and if we want to revert both the UPDATE and the DELETE, back to when the records were just inserted, we can ROLLBACK TO save_upd.

ROLLBACK TO save_upd;

SELECT * FROM orders;

Running these commands returns the records as they were before the UPDATE and DELETE were made.
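SQLite supports the same SAVEPOINT / ROLLBACK TO semantics, so the transaction flow above can be sketched with sqlite3 (isolation_level=None puts the connection in manual-transaction mode; the order values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manage transactions ourselves
conn.execute("CREATE TABLE orders (order_id INT, amount INT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO orders VALUES (15, 1000)")
conn.execute("SAVEPOINT save_upd")          # savepoint taken before the update
conn.execute("UPDATE orders SET amount = 12900 WHERE order_id = 15")
conn.execute("ROLLBACK TO save_upd")        # undo everything after the savepoint
conn.execute("COMMIT")                      # the INSERT survives, the UPDATE does not

print(conn.execute("SELECT amount FROM orders WHERE order_id = 15").fetchone()[0])  # 1000
```
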

Aggregation functions in SQL –

Aggregate functions are called only in the SELECT / HAVING clause.

  • AVG( ) – Returns floating point value.

SELECT AVG (column_name) FROM TABLE;

  • COUNT( ) – Returns the number of rows, which is an integer value.

  1. SELECT COUNT( * ) FROM TABLE;

This query counts and returns the number of records in the table.

  2. SELECT COUNT( * ) FROM TABLE WHERE { condition };

  • MAX( ) – Returns the maximum value.

SELECT MAX (column_name) FROM TABLE;

  • MIN( ) – Returns the minimum value.

     SELECT MIN (column_name) FROM TABLE;

  • SUM( ) – Returns the total sum.

SELECT SUM (column_name) FROM TABLE;

  • ROUND( ) – Used to specify the number of digits after the decimal point.

SELECT ROUND(AVG ( column_name ),3 ) FROM TABLE;
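All of the aggregate functions above can be exercised in one query with sqlite3 (the amounts below are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 800.0), (2, 2000.0), (3, 500.0)])

# COUNT, MIN, MAX, SUM, and ROUND(AVG(...)) in a single SELECT.
row = conn.execute(
    "SELECT COUNT(*), MIN(amount), MAX(amount), SUM(amount), ROUND(AVG(amount), 3) "
    "FROM orders"
).fetchone()
print(row)  # (3, 500.0, 2000.0, 3300.0, 1100.0)
```
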

Use of ORDER BY :

It is used to sort rows based on a specific column either in ascending / descending order.

Syntax :

  1. SELECT Column1, Column2 FROM TABLE ORDER BY column1 ASC / DESC ;

2. SELECT column1, column2 FROM TABLE WHERE { condition } ORDER BY column1 ;

Use of LIMIT :

It allows us to limit the number of rows returned for a query.

Syntax :


SELECT * FROM TABLE

ORDER BY column_name DESC / ASC

LIMIT 10 ;
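ORDER BY and LIMIT combine naturally, e.g. to fetch the top-N rows; a small sqlite3 sketch with illustrative data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INT, amount INT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 500), (2, 2000), (3, 800), (4, 1200)])

# The two largest orders: sort descending, then keep only 2 rows.
rows = conn.execute(
    "SELECT order_id, amount FROM orders ORDER BY amount DESC LIMIT 2"
).fetchall()
print(rows)  # [(2, 2000), (4, 1200)]
```
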

Use of BETWEEN  Operator:

It can be used to get a range of values.

Syntax :

  1. Value BETWEEN LOW and  HIGH
  2. Value NOT BETWEEN LOW and HIGH.

SELECT COUNT(*) FROM orders WHERE amount BETWEEN 800 AND 2000;

This statement returns the number of orders with amounts between 800 and 2000, inclusive.

So we provide a low value and a high value as the range for the BETWEEN clause.

Use of IN Operator :

It checks if the value is included in the list of options provided.

Syntax :

  1. SELECT * FROM TABLE WHERE column_name IN (option1, option2, option3, .....) ;
  2. SELECT * FROM orders WHERE amount IN (200, 800, 9000) ;
  3. SELECT * FROM orders WHERE order_id NOT IN (12, 13, 14) ;

This way the IN / NOT IN operator can be used to fetch the required records from the table; it acts as a filter that keeps only the records matching the list of options.

Use of LIKE Operator :

It is helpful in pattern matching against string data with the use of wildcard characters.

LIKE is case-sensitive in databases such as PostgreSQL (some, like MySQL and SQLite, match case-insensitively by default).

  1. Get all records from the table where the name begins with 'D'.

SELECT * FROM student WHERE first_name LIKE 'D%' ;

  2. All names that end with 'a'.

SELECT * FROM student WHERE first_name LIKE '%a' ;

  3. All names with 'on' as the second and third letters.

SELECT * FROM student WHERE first_name LIKE '_on%' ;

  4. All names with 'ar' anywhere in them.

SELECT * FROM customer WHERE first_name LIKE '%ar%' ;

This way we can use LIKE to find / match patterns.
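The LIKE, IN, and BETWEEN filters can all be tried with sqlite3 (names below are made up; note that SQLite's LIKE is case-insensitive for ASCII, unlike PostgreSQL's):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INT, first_name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Dia"), (2, "Ron"), (3, "Karan"), (4, "Dev")])

# LIKE 'D%': names beginning with D.
d_names = conn.execute(
    "SELECT first_name FROM student WHERE first_name LIKE 'D%' ORDER BY first_name"
).fetchall()
print(d_names)  # [('Dev',), ('Dia',)]

# IN: only the listed ids pass the filter.
ids = conn.execute(
    "SELECT student_id FROM student WHERE student_id IN (1, 3) ORDER BY student_id"
).fetchall()
print(ids)  # [(1,), (3,)]

# BETWEEN: an inclusive low-high range.
rng = conn.execute(
    "SELECT COUNT(*) FROM student WHERE student_id BETWEEN 2 AND 4"
).fetchone()[0]
print(rng)  # 3
```
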

Use of Group By Clause:

GROUP BY allows us to aggregate rows according to some category. The GROUP BY clause must appear right after the FROM or WHERE clause.

Syntax :

SELECT column1, column2, SUM( column3 )

FROM table_name

WHERE column2 IN ( option1, option2 )

GROUP BY column1, column2 ;

SELECT COUNT(cust_id) , address

FROM customer

GROUP BY address;

Use of HAVING clause :

It allows us to use an aggregate result as a filter alongside GROUP BY.

Syntax :

SELECT COUNT(cust_id), address

FROM customer

GROUP BY address

HAVING COUNT(cust_id) > 4;
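The GROUP BY + HAVING pattern above can be run directly with sqlite3 (customer rows are illustrative; the HAVING threshold is lowered to fit the tiny sample):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INT, address TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, "Delhi"), (2, "Delhi"), (3, "Pune"), (4, "Delhi")])

# Count customers per address, then keep only groups with more than 2.
rows = conn.execute(
    """SELECT address, COUNT(cust_id)
       FROM customer
       GROUP BY address
       HAVING COUNT(cust_id) > 2"""
).fetchall()
print(rows)  # [('Delhi', 3)]
```
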

Use of AS :

‘AS’ keyword is used to assign alias names to a column or a table.

Syntax :

SELECT { column_name } AS { column_alias } FROM { table_name };


Use of Views :

A view is a virtual table, not a physical one. It is created to save a result-set we look at often, providing a reusable overview of that result-set.

Syntax :

CREATE { OR  REPLACE } VIEW { view_name } AS

SELECT { column1 , column2 , column3 ,….. columnN }

FROM { table_name }

WHERE { conditions } ;
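A view can be created and queried like a table in sqlite3 (the big_orders view and its threshold are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INT, amount INT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 500), (2, 2000), (3, 900)])

# A view stores the query, not the data; it stays in sync with the base table.
conn.execute(
    "CREATE VIEW big_orders AS "
    "SELECT order_id, amount FROM orders WHERE amount > 800"
)

rows = conn.execute("SELECT order_id FROM big_orders ORDER BY order_id").fetchall()
print(rows)  # [(2,), (3,)]
```
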

Joins :

Types of Joins :

  • Inner Join
  • Left Join
  • Right Join
  • Full Join

INNER JOIN : It compares each row of table A with each row of table B to find all pairs of rows that satisfy the join predicate, and returns the matching records.


Syntax :

SELECT columns

FROM tableA

INNER JOIN tableB

ON tableA.required_column = tableB.required_column ;

LEFT JOIN : Returns all the records from the left table, even if there are no matches in the right table; it shows NULL where records are not matched.

Syntax :

SELECT tableA.column1 , tableB.column2

FROM tableA

LEFT JOIN tableB ON tableA.common_field = tableB.common_field

ORDER BY tableA.required_column , tableA.required_column  ;

RIGHT JOIN : Returns all the records from the right table, even if there are no matches in the left table; it shows NULL where records are not matched.


Syntax :

SELECT tableA.col1 , tableB.col2

FROM tableA

RIGHT JOIN tableB

ON tableA.required_column = tableB.required_column ;

FULL JOIN : Combines the results of the left and right joins, returning all records from both tables, with NULL where there is no match.

Syntax :

SELECT tableA.col1 , tableB.col2

FROM tableA

FULL JOIN tableB

ON tableA.required_column = tableB.required_column ;
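INNER and LEFT joins can be demonstrated with sqlite3 (the customer/orders tables are made up; RIGHT and FULL joins follow the same pattern but only arrived in SQLite 3.39):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INT, name TEXT)")
conn.execute("CREATE TABLE orders (order_id INT, cust_id INT, amount INT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Asha"), (2, "Dev")])
conn.execute("INSERT INTO orders VALUES (10, 1, 500)")

# INNER JOIN: only customers with a matching order appear.
inner = conn.execute(
    """SELECT customer.name, orders.amount
       FROM customer
       INNER JOIN orders ON customer.cust_id = orders.cust_id"""
).fetchall()
print(inner)  # [('Asha', 500)]

# LEFT JOIN: every customer appears; NULL (None) where no order matches.
left = conn.execute(
    """SELECT customer.name, orders.amount
       FROM customer
       LEFT JOIN orders ON customer.cust_id = orders.cust_id
       ORDER BY customer.cust_id"""
).fetchall()
print(left)  # [('Asha', 500), ('Dev', None)]
```
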



In this post we discuss the basics of cloud computing and Artificial Intelligence in general, and how Google Cloud AI provides services for the same. Before getting into Google Cloud services in particular, let's try to understand the purpose of cloud computing and the resources it offers.

What is Cloud computing?

Over the last decade a lot of services have moved online: movies, songs, shopping, storage, and much more. Many businesses now provide online services, shops, and courses.

Traditionally, to launch your own online service you would have had to buy and assemble hardware for computing, storage, and security, building out large servers and data warehouses.

And if your business had to scale up, you would have needed to set up new hardware to match the demand.

Yet another problem would have been handling customer load, with your hardware location becoming a single point of failure (in case of any technical issue at that location, your entire service would go down).

Google Cloud, or cloud computing in general, tries to solve these problems. Imagine a service that lets you focus on building and marketing your product while providing managed hardware resources, ensuring data security, and allowing you to scale up and down according to the need of the hour.


To summarize, the features of cloud computing are:

  • Maintenance – You can choose what level of administration and management you handle and what the cloud service provider should handle.
  • Data security – Cloud services like Google Cloud and AWS are highly secure, so you can be assured of data security.
  • No single point of failure – The collapse of, or a temporary issue at, one data center won't cause your service to go down.
  • Storage services – You get huge storage capacities.
  • Pay as you go – Possibly the best feature of the cloud: it assigns compute resources according to the traffic on your service, so you pay only for what you use.
  • Hosting services – It lets you host applications on the cloud.
  • Big data and machine learning (cloud AI) – You can run big data clusters on the cloud, and also train and deploy machine learning and deep learning models there.

Now that we have a basic understanding of what cloud computing is, let's have a brief discussion on AI.

What is AI?

AI models are software systems that can make human-like intelligent decisions or predictions when introduced to new or real-world data. Learning-based software is required when rule-based software is not enough to solve a problem.

We won't dive into great detail regarding the mathematics and algorithms behind ML, DL, or AI in this post, but let's list a few cases where online businesses utilize AI:

  1. Recommendation systems (Netflix, Amazon, Flipkart)
  2. Language translation tasks
  3. Sentiment analysis in tweets, comments and products reviews
  4. Fraud detection using machine learning
  5. Object/face detection, classifying/locating objects
  6. Document classification

and the list goes on and on. Have a look at the stats below for an estimate of how major industries are incorporating AI in their respective domains.

image ref: NVIDIA BLOG

Now let's have a look at the Google Cloud AI services:


Google Cloud provides many services in general. Let's list a few:

  • GCP compute – Compute Engine, Kubernetes Engine, and VMware Engine
  • Serverless – Cloud Run, Cloud Functions, App Engine
  • Storage – Filestore, Cloud Storage, Data Transfer
  • Databases – Bigtable, Datastore, Database Migration, Firestore, Memorystore, Spanner, and Cloud SQL
  • Application integration – API Gateway, Cloud Scheduler
  • Big data – Composer, Dataproc, Dataflow, and Datastream
  • Networking and operations – Network services, VPC networks, network security, monitoring, debugger, logging
  • Artificial intelligence – Vertex AI, AI building blocks, AutoML, Text-to-Speech, and more

AI services in detail

Now let's focus our attention on the Google Cloud AI services in particular.


GCP allows you to prepare your data , train models and deploy models into production.

Also, GCP's AI and machine learning products help you select features according to your specific needs. Below are the GCP AI and ML products:

  1. Build with AI – products like AutoML and AI infrastructure
  2. Conversational AI – APIs for speech-to-text and text-to-speech conversion
  3. AI for documents – natural language, translation, and computer vision tasks
  4. AI for industries – media translation, recommendation systems, Healthcare NLP

As we have seen, GCP lets you use big data queries to fetch and preprocess your data. Below you can visualize the entire workflow of an AI project, from beginning to deployment. Notice how one can build notebooks and pipelines all on cloud premises.




Hypothesis testing and p values

About the author

Aparna Mishra


Hypothesis testing, p-values, and the statistics involved are among the most commonly asked topics in interviews, so it is always better to be prepared with the mathematical details as well as real-life examples and business scenarios to express your point better. This post aims to help you do the same. Let's begin!

What is a Hypothesis?

A Hypothesis is a statement that can be tested by experiment or observation, provided we have past data. For example, we can make a statement like "Cristiano Ronaldo is the best footballer" and test it against the data of past football matches.

Steps involved in Hypothesis testing :

  • Formulating a Hypothesis.
  • Finding the right test for the Hypothesis. (Outcomes of tests refer to the population parameter rather than the sample statistic.)
  • Executing the test.
  • Making a decision on the basis of the test.

What cannot be a Hypothesis?

A statement is not a Hypothesis if it cannot be tested and we have no data regarding it.

There are two Hypotheses:

  • Null Hypothesis – denoted by- H0
  • Alternate Hypothesis – denoted by- H1

The Null Hypothesis is the statement we are trying to reject: it represents the present state of affairs, while the Alternate Hypothesis is our own claim.

Null Hypothesis

A Null Hypothesis is the hypothesis that is tested for rejection after assuming it to be true. The concept is similar to "innocent until proven guilty": it is considered true until it is rejected.

Alternate Hypothesis :

Alternate Hypothesis is the opposite of Null Hypothesis. Whatever we assume our Null Hypothesis to be , the Alternate Hypothesis is the complement of that assumption.

Simple and Composite Hypothesis :

A Simple Hypothesis is one in which the statement gives an exact value for the parameter.

Example – A textile company claiming that it exports its products and makes exactly $10,000 per month.

A Composite Hypothesis is one in which the statement gives a range of values for the parameter.

For example – The average height of girls in the class is greater than 5 feet.

Two Tailed Test :

If the Alternate Hypothesis allows the parameter to differ from the Null Hypothesis value in either direction, it is called a two-tailed test.

One Tailed Test :

If the Alternate Hypothesis gives the alternative in only one direction for the parameter specified in the Null Hypothesis, it is called a one-tailed test.

Critical region :

Also called the Rejection Region: the set of values of the test statistic for which the Null Hypothesis is rejected. If the observed test statistic falls in the critical region, we reject the Null Hypothesis and accept the Alternate Hypothesis.

Confidence Interval :

A confidence interval is the range of values within which the population parameter is expected to lie with a given level of confidence (e.g. 95%).

Type I and Type II error :

Type I error :

Rejecting the Null Hypothesis when it is actually true (a false positive). Its probability is the significance level, alpha.

Type II error :

Failing to reject the Null Hypothesis when it is actually false (a false negative). Its probability is denoted beta.

P- value :

The p-value is the smallest level of significance at which the null hypothesis can be rejected. This is why many tests report a p-value: it is often preferred, since it gives more information than a comparison against a single critical value.

The smaller the p-value, the stronger the evidence that we should reject the null hypothesis.

  • If p > .10 → “not significant”
  • If p ≤ .10 → “marginally significant”
  • If p ≤ .05 → “significant”
  • If p ≤ .01 → “highly significant.”
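As a small worked example of these thresholds: for a test statistic that is standard normal under the null hypothesis, the two-tailed p-value can be computed with Python's math module alone (z = 1.96 is the familiar 5% cutoff).

```python
import math

def two_tailed_p(z):
    """Two-tailed p-value for a standard-normal test statistic z."""
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))  # P(Z <= |z|)
    return 2.0 * (1.0 - phi)

p = two_tailed_p(1.96)
print(round(p, 4))  # 0.05 -> "significant" at the 5% level
```

By the table above, z = 0.5 would give p ≈ 0.62 ("not significant"), while z = 3.0 would give p ≈ 0.003 ("highly significant").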

Kolmogorov–Smirnov(KS test)

KS test: Understanding the distribution of data points is one of the important tasks in any machine learning problem. Many mathematical assumptions, feature engineering techniques, and algorithm selections depend on the nature of the distribution of the data points.

This study of distributions often leads to a situation where we need to compare two distributions. A simple example is checking whether your train and test points come from the same distribution. Another question that arises is how to "quantify" the difference between two distributions; it is not as easy as subtracting two scalars or vectors.

KS test

The KS test, or Kolmogorov–Smirnov test, is a method for finding out whether two distributions are similar or not. Concepts like hypothesis testing and the p-value are prerequisites for understanding this test, since the KS test utilizes the concept of hypothesis testing.

In any hypothesis test there is a null hypothesis, which in this case is "the two samples come from the same distribution". Before understanding when we retain or reject this hypothesis, let's look at a few steps below:

  1. The problem statement: we need to conclude whether two collections of data points {Xn, Ym} come from the same distribution (n and m are the number of points in each sample set).
  2. The first step of the KS test is to draw the empirical CDF of both sample sets.
  3. We then need an expression that quantifies the difference between these two CDFs. The expression below, known as the K-S statistic, serves our purpose:

Dn,m = sup_x | F1,n(x) − F2,m(x) |

where F1,n and F2,m are the empirical CDFs of the two samples.

Once we know the KS statistic and the null hypothesis, we come to the part that tells us when to reject the null hypothesis: reject at level alpha if

Dn,m > c(alpha) * sqrt( (n + m) / (n * m) ),   where c(alpha) = sqrt( −ln(alpha / 2) / 2 ).

Comparison with p value

From the above expression for Dn,m, one can easily get a relation of the form F < alpha (by simply squaring and taking the exponent). Below you can see the result:

2 * exp{ −2 * Dn,m² * ( nm / (n + m) ) } < alpha

Comparing the above expression with the (p-value < alpha) rule of hypothesis testing, we can say that the LHS is the p-value of the KS test (notice that it depends on the number of points in both sample sets).
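The two-sample statistic itself is easy to sketch in plain Python: build each empirical CDF and take the largest gap between them over the pooled sample points (a minimal illustration, not a full test with p-values).

```python
def ecdf(sample, x):
    """Empirical CDF of `sample` evaluated at x: fraction of points <= x."""
    return sum(v <= x for v in sample) / len(sample)

def ks_statistic(xs, ys):
    """Two-sample K-S statistic: max |F1,n(x) - F2,m(x)| over pooled points."""
    pooled = sorted(xs) + sorted(ys)
    return max(abs(ecdf(xs, v) - ecdf(ys, v)) for v in pooled)

print(ks_statistic([1, 2, 3], [1, 2, 3]))  # 0.0 (identical samples)
print(ks_statistic([1, 2, 3], [7, 8, 9]))  # 1.0 (fully separated samples)
```

Plugging the resulting Dn,m into the expression above (with n = m = 3) then gives the p-value to compare against alpha.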

The Kolmogorov–Smirnov statistic in more than one dimension

One approach to generalizing the Kolmogorov–Smirnov statistic to higher dimensions which meets the above concern is to compare the CDFs of the two samples over all possible orderings, and take the largest of the resulting set of K-S statistics. In d dimensions, there are 2^d − 1 such orderings.


More to come!