Machine Learning

Comment Analysis using NLP

If you have seen Jarvis in Iron Man and wanted an AI of your own, you have probably started exploring the world of ML and AI. A single project will not get you there, but once your own computer can tell whether a statement is positive or negative, you are one step closer. Maybe someday this will help bring Jarvis to you.

This is a good project for practising Natural Language Processing and for moving beyond basic libraries like pandas and NumPy on your way to becoming a Data Scientist.




OUTLINE:

In machine learning the problem is usually not the algorithm so much as the data. We will build our own data set using web scraping, use it to train and test a model, and then use that model for review/comment analysis. We will use the nltk library for natural language processing, along with other libraries such as bs4 and sklearn. The tedious part is the time taken to extract and process the data; otherwise, the code is very simple to understand.

PREREQUISITE:

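The following libraries should be available in your Python environment (time, string and csv ship with Python; the rest are installed with pip, where sklearn is published as scikit-learn and bs4 as beautifulsoup4):

  • requests
  • time
  • string
  • bs4 (for BeautifulSoup)
  • csv
  • pandas
  • nltk
  • sklearn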



REQUIREMENTS:

Hardware:

  • One laptop or desktop with any OS (Linux/Windows/macOS)

Software/Technology:

  • A browser (Chrome/Firefox/Opera/Safari/IE) to run the program. I prefer Chrome or Firefox because their developer tools let you see what is going on behind a page (right-click and select ‘Inspect’, or press Ctrl+Shift+I, then switch to the Console tab in that window).
  • For this project, I prefer an IDE (Integrated Development Environment), specifically Jupyter Notebook. It has many benefits; one is that while programming you can run parts of the program and see their results.
  • If you want, you can use a text editor (Visual Studio (VS) Code/Atom/Sublime/Notepad) instead. I prefer VS Code because it offers code completion and has attractive themes.
  • Internet Connection

IMPLEMENTATION:

We will complete it in three parts:

1) Data extraction using web scraping 

2) Training and Testing of our model

3) Practically checking a comment

DATA EXTRACTION

1. We need a large data set, so we go where we can get one. I got the data from the review section of the iPhone 6 on the Flipkart site (direct link below):

'https://www.flipkart.com/apple-iphone-6-gold-32-gb/product-reviews/
itmewxhuufbzchrn?pid=MOBEWXHUSBXVJ7NZ&lid=
LSTMOBEWXHUSBXVJ7NZPXN7ZL
&marketplace=FLIPKART' 

2. This opens the first page of iPhone 6 reviews. We will first extract the data we need from here.

3. We need to understand what we are extracting. There are two things: the rating and the review text. We will treat a rating of 3 as neutral, less than 3 as negative and greater than 3 as positive.

4. We could extract the rating and the review individually, but I extract them together from the div that contains both. Here its class is 'col _390CkK _1gY8H-' (class names like this change when the site is updated, so check the current one with Inspect).

5. Import BeautifulSoup from bs4 and import requests. Then, using the get method from requests, we request the website and download the whole page into a variable:

response = requests.get("URL")

6. After this we instantiate a soup object, which takes the text to parse and the parser to use:

soup = BeautifulSoup(response.text, 'lxml')

NOTE: if 'lxml' produces an error, use 'html.parser' instead.
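The parser fallback from the note can be wrapped in a small helper. This is only a minimal sketch and not part of the original code: make_soup is a hypothetical name, and it relies on bs4 raising FeatureNotFound when the requested parser is not installed.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(html_text):
    """Parse HTML with lxml if it is installed, otherwise with html.parser."""
    try:
        return BeautifulSoup(html_text, "lxml")
    except FeatureNotFound:
        # lxml is an optional extra; html.parser ships with Python itself.
        return BeautifulSoup(html_text, "html.parser")


# A tiny inline page stands in for response.text so this runs offline.
soup = make_soup("<html><body><p>Nice phone</p></body></html>")
print(soup.p.get_text())
```

In the real script you would pass response.text to the helper instead of the inline string.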

7. Use the find_all method to find all occurrences of the class 'col _390CkK _1gY8H-' and put the result in a variable named content or reviews (whatever you want).

8. We add the text of each match to a temporary list in the following way:

a = []
for i in reviews:
    a.append(i.get_text())
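Steps 7 and 8 can be tried without touching the network. The sketch below is an illustration under assumptions: SAMPLE_HTML is a made-up stand-in for response.text, extract_blocks is a hypothetical helper name, and the class string is the one quoted in this article, which may no longer match the live site.

```python
from bs4 import BeautifulSoup

# Class string quoted in the article; it may no longer match the live site.
REVIEW_CLASS = "col _390CkK _1gY8H-"

# Made-up stand-in for response.text so the example runs offline.
SAMPLE_HTML = """
<div class="col _390CkK _1gY8H-"><div>5</div><p>Great phone</p></div>
<div class="col _390CkK _1gY8H-"><div>2</div><p>Battery drains fast</p></div>
<div class="other">not a review</div>
"""


def extract_blocks(html_text, css_class=REVIEW_CLASS):
    """Return the text of every div whose class attribute equals css_class."""
    soup = BeautifulSoup(html_text, "html.parser")
    # A class_ string containing spaces is matched against the whole attribute.
    reviews = soup.find_all("div", class_=css_class)
    return [block.get_text() for block in reviews]


a = extract_blocks(SAMPLE_HTML)
print(a)
```

Each entry of a starts with the rating digit followed by the review text, which is the shape the next step relies on.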

9. Now, using a loop, we add the review and its label to two separate lists (view and rate, which must be defined beforehand). The first character of each entry is the rating and the rest is the review text:

for i in a:
    if int(i[0]) < 3:
        view.append(i[1:])
        rate.append("negative")
    elif int(i[0]) == 3:
        view.append(i[1:])
        rate.append("neutral")
    else:
        view.append(i[1:])
        rate.append("positive")
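The same labelling rule can be written as two small functions, which makes it easier to test and to skip blocks that do not start with a rating. This is a sketch, not the original code: label_from_rating and split_block are hypothetical names, and skipping malformed blocks is my assumption about how you would want to handle them.

```python
def label_from_rating(rating):
    """Map a 1-5 star rating to the three labels used in this project."""
    if rating < 3:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"


def split_block(text):
    """Split '5Great phone' into ('Great phone', 'positive').

    Returns None when the block does not start with a rating digit, so a
    stray block is skipped instead of raising ValueError.
    """
    if not text or text[0] not in "12345":
        return None
    return text[1:], label_from_rating(int(text[0]))


view, rate = [], []
for block in ["5Great phone", "3It is okay", "1Stopped working", "Read more"]:
    pair = split_block(block)
    if pair is not None:
        view.append(pair[0])
        rate.append(pair[1])
```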

10. Now we need to extract the data from the other pages too.

11. Put all the code in one place and amend the URL in the following way:

'https://www.flipkart.com/apple-iphone-6-gold-32-gb/product-reviews/
itmewxhuufbzchrn?pid=MOBEWXHUSBXVJ7NZ
&lid=LSTMOBEWXHUSBXVJ7NZPXN7ZL
&marketplace=FLIPKART&page=' + str(i)

12. Put this in a for loop with i in range(2, 1649), because there are 1648 pages and the first page has already been scraped.

Note: It is a large data set, so scraping may take about 2.5 hours.
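The page loop can be sketched as a generator of URLs plus a loop that fetches them. The names page_urls and scrape_pages are hypothetical, and the pause between requests is my own assumption (the article only lists time among the libraries); fetch is passed in as a parameter so the sketch can be tried without hitting the site.

```python
import time

# Base URL as given in the article, joined into one string.
BASE_URL = (
    "https://www.flipkart.com/apple-iphone-6-gold-32-gb/product-reviews/"
    "itmewxhuufbzchrn?pid=MOBEWXHUSBXVJ7NZ"
    "&lid=LSTMOBEWXHUSBXVJ7NZPXN7ZL"
    "&marketplace=FLIPKART"
)


def page_urls(base_url, last_page):
    """Yield the URLs of pages 2..last_page (page 1 is base_url itself)."""
    for i in range(2, last_page + 1):
        yield base_url + "&page=" + str(i)


def scrape_pages(urls, fetch, delay_seconds=1.0):
    """Call fetch(url) for every URL, pausing between requests."""
    pages = []
    for url in urls:
        pages.append(fetch(url))
        time.sleep(delay_seconds)  # assumed politeness delay, not in the article
    return pages


urls = list(page_urls(BASE_URL, 1648))
```

In the real script, fetch would be something like lambda url: requests.get(url).text, and a smaller last_page keeps a trial run short.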

13. Now we need to save the whole data in CSV form.

14. Using file handling, open a file in write mode using with open and declare a writer in the following way:

writer = csv.writer(file)

15. Write the first row as 'reviews', 'ratings' (these are the column names used later when training).

16. Now, with a for loop, add all the data from the lists to the file.

Note: remember to give your file the .csv extension.
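Steps 13 to 16 can be sketched with the standard csv module as follows. The function names and the sample file path are hypothetical, and the two sample rows are made up; only the header names come from the steps above.

```python
import csv
import os
import tempfile


def write_reviews_csv(path, reviews, ratings):
    """Write a header row followed by one (review, rating) row per entry."""
    # newline="" stops the csv module from adding blank lines on Windows.
    with open(path, "w", newline="", encoding="utf-8") as file:
        writer = csv.writer(file)
        writer.writerow(["reviews", "ratings"])
        for review, rating in zip(reviews, ratings):
            writer.writerow([review, rating])


def read_rows(path):
    """Read the file back as a list of rows, to check what was written."""
    with open(path, newline="", encoding="utf-8") as file:
        return list(csv.reader(file))


# Hypothetical sample path in the temp directory; use your own file name.
sample_path = os.path.join(tempfile.gettempdir(), "reviews_sample.csv")
write_reviews_csv(
    sample_path,
    ["Great phone", "Stopped working"],
    ["positive", "negative"],
)
```

With the scraped data you would pass the view and rate lists instead of the two sample lists.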

Your CSV data is ready. It is a large data set, close to 1 crore reviews, and it takes time to scrape, so you can reduce the number of pages.

If you want, you can get a better data set from Kaggle or other sites, since this one can be a biased collection of reviews; just modify the code accordingly.

TRAINING AND TESTING MODEL

1. We will be using the nltk and sklearn libraries, along with string and other packages.

2. First we read the CSV file using the pandas library.

3. We import nltk, the string library and stopwords from nltk.corpus, and then define a function (text_process) to remove punctuation and stop words from our data.
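A cleaning function of this kind could look like the sketch below. To keep it runnable without downloading the nltk corpus, it uses a tiny hard-coded stop-word set, which is my assumption for illustration only; to follow the article, run nltk.download('stopwords') once and pass set(stopwords.words('english')) as stop_words.

```python
import string

# Tiny illustrative stop-word set; the article uses nltk's English list instead.
SAMPLE_STOP_WORDS = {"a", "an", "the", "is", "this", "it", "and", "or", "of", "to", "in"}


def text_process(text, stop_words=SAMPLE_STOP_WORDS):
    """Remove punctuation, then drop stop words, and return the remaining words."""
    no_punct = "".join(ch for ch in text if ch not in string.punctuation)
    return [word for word in no_punct.split() if word.lower() not in stop_words]


tokens = text_process("This phone is great!!!")
print(tokens)
```

Returning a list of words is what lets the function be passed as the analyzer of CountVectorizer later on.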

4. Now we import CountVectorizer and TfidfTransformer from sklearn.feature_extraction.text, and MultinomialNB from sklearn.naive_bayes.

5. We need to divide our data for training and testing, so we import train_test_split from sklearn.model_selection and divide the data in the following way:

# csv here is the DataFrame read with pandas in step 2
msg_train, msg_test, label_train, label_test = train_test_split(
    csv['reviews'], csv['ratings'],
    test_size=0.2, random_state=101)

6. Now we import Pipeline from sklearn.pipeline.

7. Now we chain the processing steps and the classifier into a single pipeline in the following way:

pipeline = Pipeline([
    ('bow', CountVectorizer(analyzer=text_process)),
    ('tfidf', TfidfTransformer()),
    ('classifier', MultinomialNB())
])




8. Now we train the model with the following step:

pipeline.fit(msg_train, label_train)

9. This may take hours, depending on the size of the data set and the processor of your system.

10. After training, we cross-check with the test data:

predictions = pipeline.predict(msg_test)

11. Now import classification_report from sklearn.metrics and print the report:

print(classification_report(label_test, predictions))
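To see the split, fit, predict and report steps run end to end in a few seconds, here is a minimal sketch on a made-up toy data set. The twelve reviews and their labels are invented for illustration, and text_process here is a simplified analyzer (punctuation removal and lowercasing only), not the stop-word version described in step 3.

```python
import string

from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline


def text_process(text):
    """Simplified analyzer: strip punctuation, lowercase, split on whitespace."""
    cleaned = "".join(ch for ch in text if ch not in string.punctuation)
    return cleaned.lower().split()


# Made-up reviews, only to make the example runnable.
reviews = [
    "Great phone, love it", "Excellent camera and battery",
    "Very good value", "Amazing display, works great",
    "Love the design", "Good phone, excellent sound",
    "Terrible battery, stopped working", "Bad camera and poor display",
    "Worst phone ever", "Poor quality, very bad",
    "Stopped working after a week", "Terrible, waste of money",
]
ratings = ["positive"] * 6 + ["negative"] * 6

msg_train, msg_test, label_train, label_test = train_test_split(
    reviews, ratings, test_size=0.25, random_state=101)

pipeline = Pipeline([
    ("bow", CountVectorizer(analyzer=text_process)),
    ("tfidf", TfidfTransformer()),
    ("classifier", MultinomialNB()),
])
pipeline.fit(msg_train, label_train)
predictions = pipeline.predict(msg_test)

# zero_division=0 avoids warnings when a label is never predicted.
print(classification_report(label_test, predictions, zero_division=0))
```

With so little data the scores themselves mean nothing; the point is only that the same calls work unchanged on the scraped CSV.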

12. To check our own comment, we do the following:

pipeline.predict(['comment'])

The work is done and the program is ready, but the trained model lives only in RAM. If you want to keep it, you can save it to disk using the pickle library or joblib; look them up for more details.
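Saving and reloading with pickle can be sketched as below. The helper names and the file path are hypothetical, and a plain dict stands in for the fitted pipeline so that the example runs on its own; the same two functions accept the pipeline object.

```python
import os
import pickle
import tempfile


def save_model(model, path):
    """Serialise any picklable object (for example a fitted pipeline) to disk."""
    with open(path, "wb") as file:
        pickle.dump(model, file)


def load_model(path):
    """Load an object saved with save_model."""
    # Only unpickle files you created yourself; pickle can run arbitrary code.
    with open(path, "rb") as file:
        return pickle.load(file)


# A plain dict stands in for the fitted pipeline so the example is standalone.
stand_in_model = {"labels": ["negative", "neutral", "positive"]}
model_path = os.path.join(tempfile.gettempdir(), "comment_model.pkl")
save_model(stand_in_model, model_path)
restored = load_model(model_path)
```

One thing to keep in mind: pickle stores functions by reference, so a pipeline built with analyzer=text_process can only be loaded in a program where text_process is defined in the same module as when it was saved.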

Skyfi Labs Last Updated: 2022-03-21






