Debug School

rakesh kumar

How to Implement Sentiment Analysis in Django

Step 1: Tokenization
First, we'll tokenize the input text using the AutoTokenizer. This step converts the input text into a sequence of tokens.

from transformers import AutoTokenizer

# Create the tokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Input text
input_text = "I love natural language processing"

# Tokenize the input text
tokens = tokenizer.tokenize(input_text)

print(tokens)

Output:

['i', 'love', 'natural', 'language', 'processing']

Step 2: Convert tokens to input tensors
Next, we'll convert the tokens into input tensors that can be processed by the model. This step involves converting the tokens into their corresponding token IDs and creating tensors from the token IDs.

import torch

# Convert tokens to input tensor
input_ids = tokenizer.convert_tokens_to_ids(tokens)
input_tensor = torch.tensor([input_ids])

print(input_tensor)

Output:

tensor([[ 1045,  2293,  3019,  2653, 11617]])
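Note that tokenize followed by convert_tokens_to_ids yields only the word-piece IDs. BERT models expect the sequence to be wrapped in the special tokens [CLS] and [SEP] (IDs 101 and 102 in the bert-base-uncased vocabulary). The sketch below mimics that wrapping in plain Python. The toy_vocab dictionary and the encode_with_special_tokens helper are hypothetical stand-ins for illustration: the word IDs are copied from the output above, and this is not the real vocabulary or the library's implementation.

```python
# Toy stand-in for the tokenizer vocabulary: the word IDs printed above,
# plus the special-token IDs used by bert-base-uncased.
toy_vocab = {
    "[UNK]": 100, "[CLS]": 101, "[SEP]": 102,
    "i": 1045, "love": 2293, "natural": 3019, "language": 2653, "processing": 11617,
}

def encode_with_special_tokens(tokens, vocab):
    """Map tokens to IDs and wrap them in [CLS] ... [SEP]; unknown tokens become [UNK]."""
    ids = [vocab.get(tok, vocab["[UNK]"]) for tok in tokens]
    input_ids = [vocab["[CLS]"]] + ids + [vocab["[SEP]"]]
    attention_mask = [1] * len(input_ids)  # 1 marks a real token; there is no padding here
    return {"input_ids": input_ids, "attention_mask": attention_mask}

encoded = encode_with_special_tokens(
    ["i", "love", "natural", "language", "processing"], toy_vocab
)
print(encoded["input_ids"])
```

With the real tokenizer, calling tokenizer(input_text, return_tensors="pt") performs tokenization, ID lookup, and special-token wrapping in one step, and also returns the attention mask.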

Step 3: Make predictions with the model
Now, we can use the input tensor to make predictions with the NLP model. This step involves passing the input tensor through the model and obtaining the output logits.

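from transformers import AutoModelForSequenceClassification

# Load the pre-trained model with a three-class head (negative, neutral, positive).
# bert-base-uncased ships without a sentiment head, so this layer is randomly
# initialized and needs fine-tuning before its predictions are meaningful.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)

# Make predictions
output = model(input_tensor)

print(output)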

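Output:

The call returns a SequenceClassifierOutput whose logits field (also reachable as output[0]) is a tensor holding one raw, unnormalized score per class. The exact values vary from run to run because the classification head is randomly initialized.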

Step 4: Apply softmax to get probabilities
To interpret the model's output, we can apply the softmax function to the logits. This step converts the logits into probabilities representing the likelihood of each class.

import torch.nn.functional as F

# Apply softmax to the output logits
probabilities = F.softmax(output[0], dim=1)

print(probabilities)

Output:

tensor([[0.1809, 0.5376, 0.2815]], grad_fn=<SoftmaxBackward>)
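To make the softmax step concrete, here is a minimal pure-Python sketch of the same computation. The softmax helper and the example_logits values are hypothetical and chosen only for illustration; they are not the model's actual logits.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    # Subtracting the maximum keeps exp() from overflowing on large scores
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the classes (negative, neutral, positive)
example_logits = [-0.4, 0.7, 0.05]
probs = softmax(example_logits)
print([round(p, 4) for p in probs])
```

Softmax preserves the ordering of the scores, so the class with the largest logit is also the class with the largest probability.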

Step 5: Get the predicted class
To determine the predicted class, we can find the class with the highest probability. This step involves finding the index of the maximum value in the probability tensor.

predicted_class = torch.argmax(probabilities, dim=1)
predicted_label = predicted_class.item()

print(predicted_label)

Output:

1

Step 6: Map the predicted label to text
Finally, we can map the predicted label to its corresponding text label. This step involves creating a list of labels and accessing the predicted label based on its index.

label_list = ["negative", "neutral", "positive"]
predicted_text = label_list[predicted_label]

print(predicted_text)

Output:

neutral
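Steps 5 and 6 can also be combined into one small helper that checks the number of probabilities against the number of labels before indexing, which guards against a model whose head has a different number of classes than the label list. The sketch below is plain Python; label_from_probabilities and LABELS are hypothetical names introduced for illustration.

```python
LABELS = ["negative", "neutral", "positive"]

def label_from_probabilities(probabilities, labels=LABELS):
    """Return (label, confidence) for the highest-probability class."""
    if len(probabilities) != len(labels):
        raise ValueError(
            f"expected {len(labels)} probabilities, got {len(probabilities)}"
        )
    # Index of the largest probability, the equivalent of torch.argmax
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best_index], probabilities[best_index]

# Probabilities copied from the Step 4 output above
label, confidence = label_from_probabilities([0.1809, 0.5376, 0.2815])
print(label, confidence)
```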

In this example, we tokenize the input text, convert the tokens to an input tensor, make predictions with the model, apply softmax to get probabilities, find the predicted class, and map the predicted label to its corresponding text label. Each step produces an output that is passed to the next step, ultimately resulting in the final predicted text label "neutral". Because bert-base-uncased has not been fine-tuned for sentiment, the label only becomes meaningful once the model has been trained on labeled sentiment data.

Full Code

Step 1: Install required packages
Make sure you have the required packages installed. You'll need torch and transformers.

Step 2: Create a Django view
In your Django project, create a view that handles the sentiment analysis.

from django.http import JsonResponse
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

# Load the pre-trained tokenizer and model once, when the module is imported,
# instead of on every request. num_labels=3 matches sentiment_map below.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)
model.eval()

# Map class indices to sentiment labels
sentiment_map = {0: 'negative', 1: 'neutral', 2: 'positive'}

def sentiment_analysis(request):
    # Get the input text from the request
    input_text = request.GET.get('text', '')

    # Tokenize the input text; calling the tokenizer adds [CLS]/[SEP]
    # and returns the input tensors directly
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True)

    # Make predictions without tracking gradients
    with torch.no_grad():
        output = model(**inputs)
    probabilities = F.softmax(output.logits, dim=1)
    predicted_label = torch.argmax(probabilities, dim=1).item()

    # Map the predicted label to sentiment
    predicted_sentiment = sentiment_map[predicted_label]

    # Prepare the response
    response_data = {
        'input_text': input_text,
        'predicted_sentiment': predicted_sentiment
    }

    return JsonResponse(response_data)

Step 3: Configure the URL pattern
In your Django project's urls.py, configure the URL pattern for the sentiment analysis view.

from django.urls import path
from .views import sentiment_analysis

urlpatterns = [
    path('sentiment-analysis/', sentiment_analysis, name='sentiment-analysis'),
]

Step 4: Test the sentiment analysis endpoint
You can now test the sentiment analysis endpoint by making a request to http://localhost:8000/sentiment-analysis/ with the text parameter containing the input text.

For example, if you're using curl:

$ curl -X GET "http://localhost:8000/sentiment-analysis/?text=I%20love%20this%20movie"

Output:

{
    "input_text": "I love this movie",
    "predicted_sentiment": "positive"
}
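If you call the endpoint from Python rather than curl, the text must be percent-encoded in the query string just as in the curl command above. A minimal standard-library sketch follows; build_sentiment_url is a hypothetical helper name, and the base URL is the development-server address used in this step.

```python
from urllib.parse import quote, urlencode

def build_sentiment_url(text, base="http://localhost:8000/sentiment-analysis/"):
    """Return the endpoint URL with `text` percent-encoded as a query parameter."""
    # quote_via=quote encodes spaces as %20, matching the curl example
    return f"{base}?{urlencode({'text': text}, quote_via=quote)}"

url = build_sentiment_url("I love this movie")
print(url)
# → http://localhost:8000/sentiment-analysis/?text=I%20love%20this%20movie
```

Encoding through urlencode also keeps characters such as & or = inside the text from being read as query-string separators.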

This example demonstrates a basic sentiment analysis pipeline using a tokenizer in Django. It loads a pre-trained tokenizer and model, tokenizes the input text, converts it into an input tensor, makes predictions, and returns the predicted sentiment as a JSON response.
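Loading a tokenizer and model is expensive, so it is worth doing once per process rather than once per request. One way to do that is lazy, cached loading with functools.lru_cache, sketched below. Here expensive_load is a hypothetical stand-in for the two from_pretrained(...) calls, and load_count exists only to show how often the loader runs.

```python
from functools import lru_cache

load_count = 0  # only here to show how often the loader runs

def expensive_load():
    """Hypothetical stand-in for the two from_pretrained(...) calls."""
    global load_count
    load_count += 1
    return {"tokenizer": "tokenizer-object", "model": "model-object"}

@lru_cache(maxsize=1)
def get_model_bundle():
    """Load on first use, then return the cached objects on every later call."""
    return expensive_load()

# Simulate three requests hitting the view
for _ in range(3):
    bundle = get_model_bundle()

print(load_count)
# → 1
```

Inside a view you would call get_model_bundle() at the top of the function. The first request pays the loading cost and later requests reuse the same objects.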

Implement Sentiment Analysis Using a Pretrained Model

Step 1: Install the required libraries

pip install transformers torch

Step 2: Create a Django view function to handle the POST request

import torch
from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt
from transformers import (
    DistilBertForSequenceClassification,
    DistilBertTokenizer,
    Trainer,
    TrainingArguments,
)

# Load the pre-trained tokenizer and model once, at import time.
# num_labels=3 matches the three sentiment classes below.
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=3)

sentiment_classes = ["negative", "neutral", "positive"]

def fine_tune(train_dataset, eval_dataset):
    # Run this once, offline, before serving requests; never inside a view.
    # Replace both arguments with your own tokenized, labeled datasets.
    training_args = TrainingArguments(
        output_dir='./results',
        num_train_epochs=1,
        per_device_train_batch_size=16,
        per_device_eval_batch_size=16,
        warmup_steps=500,
        weight_decay=0.01,
        logging_dir='./logs',
        logging_steps=10,
        evaluation_strategy="epoch"  # named eval_strategy in newer transformers releases
    )
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,  # required when evaluating every epoch
    )
    trainer.train()
    # Save the fine-tuned weights; point from_pretrained above at this
    # directory to serve them
    trainer.save_model('./results')

@csrf_exempt  # lets tools such as Postman or cURL post without a CSRF token
def sentiment_analysis(request):
    if request.method != 'POST':
        return JsonResponse({'error': 'Invalid request method.'}, status=405)

    # Get the input text from the POST request
    input_text = request.POST.get('text', '')

    # Tokenize the input text
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True)

    # Predict the sentiment of the input text
    model.eval()
    with torch.no_grad():
        logits = model(**inputs).logits
    predicted_label = logits.argmax(dim=1).item()

    # Map the predicted label to a sentiment class and return it as JSON
    return JsonResponse({'sentiment': sentiment_classes[predicted_label]})

Step 3: Define a URL pattern in your Django project's urls.py file to map the view function

from django.urls import path
from .views import sentiment_analysis

urlpatterns = [
    path('sentiment/', sentiment_analysis, name='sentiment'),
]

Step 4: Start the Django development server and send a POST request to the /sentiment/ endpoint with the input text as the payload. You can use tools like Postman or cURL for this purpose.

Example:

Input text: "I really enjoyed the movie!"

Output JSON response:

{
    "sentiment": "positive"
}

In this example, the fine-tuned sentiment analysis model predicts a positive sentiment for the input text.
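As an alternative to Postman or cURL, the POST request from Step 4 can be built with Python's standard library. The sketch below only constructs the request; build_sentiment_request is a hypothetical helper name, and the URL assumes the development server is running on localhost port 8000 with the /sentiment/ route defined above.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_sentiment_request(text, url="http://localhost:8000/sentiment/"):
    """Build a form-encoded POST request whose body carries the `text` field."""
    body = urlencode({"text": text}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

req = build_sentiment_request("I really enjoyed the movie!")
print(req.get_method(), req.full_url)
print(req.data)

# To send it against a running development server:
# from urllib.request import urlopen
# print(urlopen(req).read().decode("utf-8"))
```

Form encoding matters here because the view reads the text from request.POST. If the view is not CSRF-exempt, Django's CSRF middleware rejects a request like this with a 403 response.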

Second Way

from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments

# Load the pre-trained model and tokenizer
# num_labels=3 matches the three entries in label_list below
model_name = "distilbert-base-uncased"
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Define the training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    evaluation_strategy="no"  # prediction only: there is no evaluation dataset
)

# Create the sentiment analysis function
def sentiment_analysis(request):
    if request.method == "POST":
        # Get the input text from the request
        input_text = request.POST.get("text", "")

        # Tokenize the input text (plain lists; the Trainer's collator builds the tensors)
        inputs = tokenizer(input_text, truncation=True)

        # Create the Trainer object
        trainer = Trainer(
            model=model,
            args=training_args
        )

        # Perform sentiment analysis; predict() expects a dataset,
        # so wrap the single example in a one-item list
        output = trainer.predict([dict(inputs)])

        # Get the predicted label
        predicted_label = output.predictions.argmax().item()
        label_list = ["negative", "neutral", "positive"]
        predicted_text = label_list[predicted_label]

        # Prepare the response
        response = {
            "input_text": input_text,
            "predicted_label": predicted_text
        }

        return JsonResponse(response)

    return JsonResponse({"error": "Invalid request method."}, status=405)

Apply the csrf_exempt decorator to the sentiment_analysis view

sentiment_analysis = csrf_exempt(sentiment_analysis)

In the above code, we start by importing the necessary modules and classes from Django and the transformers library. We then load the pre-trained model and tokenizer using the AutoModelForSequenceClassification and AutoTokenizer classes, respectively.

Next, we define the training arguments, specifying the output directory, number of training epochs, batch sizes, warmup steps, weight decay, logging settings, and evaluation strategy.

We create a function called sentiment_analysis, which serves as the view to handle the POST request containing the input text. Inside this function, we extract the input text from the request and tokenize it using the tokenizer.

Then, we create a Trainer object with the loaded model and training arguments. We use the predict method of the Trainer to obtain the sentiment analysis output for the input text.

Finally, we retrieve the predicted label from the output and prepare the response, which includes the input text and the predicted sentiment label. We return the response as a JSON object using the JsonResponse function.

Another Method

from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments

# Define the path to your pretrained model and tokenizer
model_name = "distilbert-base-uncased"
model_path = "/path/to/your/model"
tokenizer_path = "/path/to/your/tokenizer"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)

@csrf_exempt
def sentiment_analysis(request):
    if request.method == 'POST':
        input_text = request.POST.get('text', '')

        # Tokenize the input text
        inputs = tokenizer.encode_plus(
            input_text,
            add_special_tokens=True,
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=512
        )

        # Perform sentiment analysis
        outputs = model(**inputs)
        predicted_label = outputs.logits.argmax().item()
        predicted_sentiment = "positive" if predicted_label == 1 else "negative"

        # Return the result as a JSON response
        response = {
            'input_text': input_text,
            'predicted_sentiment': predicted_sentiment
        }
        return JsonResponse(response)
    else:
        return JsonResponse({'error': 'Invalid request method'})

This example defines a Django view function called sentiment_analysis that handles the sentiment analysis task. It expects a POST request with a 'text' parameter containing the input text for sentiment analysis.

The code uses the AutoTokenizer and AutoModelForSequenceClassification classes from the Transformers library to load the pretrained tokenizer and model. You should replace model_path and tokenizer_path with the paths to your specific pretrained model and tokenizer.

Inside the sentiment_analysis function, the input text is tokenized using the tokenizer's encode_plus method. The encoded inputs are then passed to the model to obtain the predicted sentiment label. The predicted label is converted to a human-readable sentiment ('positive' or 'negative').
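The views in this article accept an empty or missing 'text' parameter and still run the model on it. A small validation step before tokenization avoids that. The sketch below is plain Python; clean_text_param and the MAX_CHARS limit are hypothetical and should be adapted to your application.

```python
MAX_CHARS = 5000  # hypothetical limit; tune it for your application

def clean_text_param(raw_text, max_chars=MAX_CHARS):
    """Return (text, error): the stripped text and None, or None and an error message."""
    text = (raw_text or "").strip()
    if not text:
        return None, "The 'text' parameter is required."
    if len(text) > max_chars:
        return None, f"The 'text' parameter must be at most {max_chars} characters."
    return text, None

print(clean_text_param("  I really enjoyed the movie!  "))
print(clean_text_param(""))
```

In the view, you would call clean_text_param(request.POST.get('text')) and, when the error is not None, return JsonResponse({'error': error}, status=400) instead of running the model.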
