How to implement fill-mask to predict a masked word in Django

The fill-mask task in NLP is the task of predicting a missing word or token in a given sentence. It is commonly used to evaluate the language-understanding capabilities of a model.

In the fill-mask task, a sentence is provided with one or more masked tokens represented by a special token, usually [MASK]. The goal is to predict the most likely word or token that should replace the masked token(s) based on the context of the sentence.

Here's an example to illustrate the fill-mask task:

Input: "I want to [MASK] a new car."
Output: "I want to buy a new car."

In this example, the word "buy" is the correct prediction for the masked token, based on the context of the sentence.

The fill-mask task can be performed using pre-trained masked language models, such as BERT or RoBERTa, which have been trained on large amounts of text data with a masked-language-modeling objective. These models have learned to use the surrounding context of words and can generate meaningful predictions for masked tokens.

from transformers import pipeline

# The default fill-mask pipeline downloads a RoBERTa-based model,
# whose mask token is <mask>.
unmasker = pipeline("fill-mask")
unmasker("This course will teach you all about <mask> models.", top_k=2)
[{'sequence': 'This course will teach you all about mathematical models.',
  'score': 0.19619831442832947,
  'token': 30412,
  'token_str': ' mathematical'},
 {'sequence': 'This course will teach you all about computational models.',
  'score': 0.04052725434303284,
  'token': 38163,
  'token_str': ' computational'}]
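Note that the mask token depends on the model's tokenizer: RoBERTa-style models use <mask>, while BERT-style models use [MASK]. Rather than hard-coding it, you can read it from the tokenizer. A minimal sketch, assuming bert-base-uncased (the same model used in the Django example below):

from transformers import AutoTokenizer, pipeline

# Build the prompt from the tokenizer's own mask token instead of
# hard-coding [MASK] or <mask>.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
fill_mask = pipeline("fill-mask", model="bert-base-uncased", tokenizer=tokenizer)

text = f"The {tokenizer.mask_token} is blue."  # -> "The [MASK] is blue."
for result in fill_mask(text, top_k=3):
    print(result["token_str"], round(result["score"], 4))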

Implement Fill-Mask in Django

Here's example code that performs fill-mask in Django using the Hugging Face transformers library and tokenizer, with the input text provided dynamically via a POST request:

Step 1: Install the required dependencies (the transformers pipelines need a backend such as PyTorch)

pip install transformers torch
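As a quick optional sanity check (not part of the original steps), you can confirm that both libraries import correctly:

import torch
import transformers

# Print the installed versions to confirm the environment is ready.
print(transformers.__version__)
print(torch.__version__)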

Step 2: Import the necessary libraries and modules in your Django views.py file

from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt
from transformers import pipeline, AutoTokenizer

Step 3: Create a Django view to handle the POST request

@csrf_exempt  # the example client below does not send a CSRF token
def fill_mask_view(request):
    # Retrieve the input text from the POST request
    input_text = request.POST.get('text')
    if request.method != 'POST' or not input_text:
        return JsonResponse({"error": "POST a 'text' field containing the mask token."}, status=400)

    # Load the tokenizer
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    # Create the pipeline for fill-mask
    # (for production, load the model once at module level rather than per request)
    fill_mask = pipeline("fill-mask", model="bert-base-uncased", tokenizer=tokenizer)

    # Perform fill-mask prediction
    results = fill_mask(input_text)

    # Process the results
    predictions = []
    for result in results:
        prediction = {
            "token": result["token_str"],
            "score": result["score"]
        }
        predictions.append(prediction)

    # Return the predictions as a JSON response
    return JsonResponse({"predictions": predictions})
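Loading the model on every request is slow. A common pattern is to build the pipeline once per process and reuse it across requests; here is a minimal sketch (the get_fill_mask helper is illustrative, not part of the original code):

# Module-level cache so the model is only loaded once per process.
_fill_mask = None

def get_fill_mask():
    global _fill_mask
    if _fill_mask is None:
        tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
        _fill_mask = pipeline("fill-mask", model="bert-base-uncased", tokenizer=tokenizer)
    return _fill_mask

The view can then call get_fill_mask() instead of rebuilding the pipeline, so only the first request pays the model-loading cost.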

Step 4: Map the Django URL to the view

from django.urls import path

from .views import fill_mask_view

urlpatterns = [
    path('fill-mask/', fill_mask_view, name='fill-mask'),
]
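If this urlpatterns lives in an app-level urls.py, remember to include it from the project's root URLconf. A sketch, where 'myapp' is a placeholder for your app name:

# project-level urls.py ('myapp' is a placeholder)
from django.urls import include, path

urlpatterns = [
    path('', include('myapp.urls')),
]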

Step 5: Send a POST request to the Django server with the input text

import requests

url = 'http://localhost:8000/fill-mask/'
input_text = 'The [MASK] is blue.'

response = requests.post(url, data={'text': input_text})
predictions = response.json()['predictions']

for prediction in predictions:
    print(f"Token: {prediction['token']}")
    print(f"Score: {prediction['score']}")
    print()

In this example, the Django view fill_mask_view handles the POST request with the input text. It loads the pre-trained BERT model and tokenizer, creates the fill-mask pipeline, and performs the fill-mask prediction on the input text. The results are processed and returned as a JSON response.

When you send a POST request to http://localhost:8000/fill-mask/ with the input text "The [MASK] is blue.", the Django server will respond with a JSON object containing the predicted tokens and their scores. The example code above then prints the predictions to the console.

Note: Make sure you have a Django server running to handle the requests. Adjust the URL and port number in the code according to your Django server configuration.

Output:

Token: sky
Score: 0.36311510276794434

Token: color
Score: 0.12426304870891571

Token: water
Score: 0.07195656740617752

Token: sky's
Score: 0.057355597913980484

Token: ocean
Score: 0.051273450702667236
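If you want to verify the endpoint without running a separate server and client script, Django's built-in test client can call the view directly. A minimal sketch, assuming the URL name 'fill-mask' from Step 4:

from django.test import TestCase
from django.urls import reverse

class FillMaskViewTests(TestCase):
    def test_returns_predictions_for_masked_text(self):
        # The test client posts form data just like the requests example above.
        response = self.client.post(reverse('fill-mask'), {'text': 'The [MASK] is blue.'})
        self.assertEqual(response.status_code, 200)
        self.assertIn('predictions', response.json())

Note that this test downloads and runs the real model, so it is slow; in a larger test suite you would normally mock the pipeline.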
