Debug School

rakesh kumar

Different ways to sort and clean data in Django

Sort data by a single column in ascending order and save the sorted data to a Django model:

import pandas as pd
from myapp.models import MyModel

data = pd.read_csv('data.csv')
sorted_data = data.sort_values('column1', ascending=True)  # Sort data by 'column1' in ascending order
for index, row in sorted_data.iterrows():
    my_model = MyModel(field1=row['column1'], field2=row['column2'])
    my_model.save()
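
To see what sort_values() returns before anything is saved, here is a minimal sketch that runs without a CSV file or a Django project. The inline csv_text is a hypothetical stand-in for data.csv, and ignore_index=True is an optional extra that the snippet above does not use.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv so the example runs on its own.
csv_text = "column1,column2\nb,2\nc,3\na,1\n"
data = pd.read_csv(io.StringIO(csv_text))

# sort_values() returns a new DataFrame; 'data' itself is left untouched.
sorted_data = data.sort_values('column1', ascending=True)

# ignore_index=True renumbers the rows 0..n-1 instead of keeping the old labels.
renumbered = data.sort_values('column1', ascending=True, ignore_index=True)

print(list(data['column1']))     # original order is unchanged
print(list(sorted_data.index))   # original row labels travel with the rows
print(list(renumbered.index))    # fresh labels starting at 0
```

Keep in mind that the order in which rows are saved does not control the order in which Django returns them later. A queryset still needs order_by() or a Meta.ordering setting on the model for that.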

Sort data by multiple columns in different orders and save the sorted data to a Django model:

import pandas as pd
from myapp.models import MyModel

data = pd.read_csv('data.csv')
sorted_data = data.sort_values(by=['column1', 'column2'], ascending=[True, False])  # Sort data by 'column1' in ascending order and 'column2' in descending order
for index, row in sorted_data.iterrows():
    my_model = MyModel(field1=row['column1'], field2=row['column2'])
    my_model.save()
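
The second sort key only matters when the first one has ties. A small sketch with made-up rows (the values are assumptions chosen so that 'column1' repeats) shows how ascending=[True, False] resolves them:

```python
import pandas as pd

# Hypothetical sample rows; 'column1' contains ties so the second key matters.
data = pd.DataFrame({
    'column1': ['b', 'a', 'b', 'a'],
    'column2': [1, 5, 9, 3],
})

# 'column1' ascending first; within equal 'column1' values, 'column2' descending.
sorted_data = data.sort_values(by=['column1', 'column2'], ascending=[True, False])

# Plain tuples make the resulting order easy to inspect.
rows = list(sorted_data.itertuples(index=False, name=None))
print(rows)
```

The ascending list must have one entry per column named in by, in the same order.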

Clean data by removing rows with missing values and save the cleaned data to a Django model:

import pandas as pd
from myapp.models import MyModel

data = pd.read_csv('data.csv')
cleaned_data = data.dropna()  # Remove rows with missing values
for index, row in cleaned_data.iterrows():
    my_model = MyModel(field1=row['column1'], field2=row['column2'])
    my_model.save()

For example, suppose the original DataFrame data contains the following rows:

  column1  column2
0   Value1      1.0
1   Value2      NaN
2   Value3      3.0
3   Value4      4.0
4   Value5      NaN

After executing the code cleaned_data = data.dropna(), the resulting cleaned_data will be:

  column1  column2
0   Value1      1.0
2   Value3      3.0
3   Value4      4.0
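
The tables above can be reproduced without a CSV file. This is a minimal sketch that builds the same five rows in memory; the subset= call at the end is an optional extra, not part of the original snippet, that restricts the missing-value check to the named columns.

```python
import pandas as pd

# The five sample rows from the example above, built in memory.
data = pd.DataFrame({
    'column1': ['Value1', 'Value2', 'Value3', 'Value4', 'Value5'],
    'column2': [1.0, float('nan'), 3.0, 4.0, float('nan')],
})

# Drop every row that has a missing value in any column.
cleaned_data = data.dropna()

# Only look at 'column1' when deciding which rows to drop.
cleaned_subset = data.dropna(subset=['column1'])

print(cleaned_data)
print(len(cleaned_subset))
```

Because 'column1' has no missing values here, the subset= version keeps all five rows, including those where 'column2' is NaN.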

Clean data by replacing missing values with a default value and save the cleaned data to a Django model:

import pandas as pd
from myapp.models import MyModel

data = pd.read_csv('data.csv')
cleaned_data = data.fillna('N/A')  # Replace missing values with 'N/A'
for index, row in cleaned_data.iterrows():
    my_model = MyModel(field1=row['column1'], field2=row['column2'])
    my_model.save()

For example, suppose the original DataFrame data contains the following rows:

  column1  column2
0   Value1      1.0
1   Value2      NaN
2   Value3      3.0
3   Value4      4.0
4   Value5      NaN

After executing the code cleaned_data = data.fillna('N/A'), the resulting cleaned_data will be:

  column1 column2
0   Value1     1.0
1   Value2     N/A
2   Value3     3.0
3   Value4     4.0
4   Value5     N/A
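
One side effect of filling a numeric column with the text 'N/A' is that the column then mixes floats and strings, which a numeric model field (for example a FloatField, if that is what field2 is) would reject on save. A possible alternative, sketched here with made-up rows, is to pass fillna() a dictionary so that each column gets a default of its own type. The default values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical rows with one missing value in each column.
data = pd.DataFrame({
    'column1': ['Value1', None, 'Value3'],
    'column2': [1.0, float('nan'), 3.0],
})

# One default per column: text for the text column, a number for the numeric one.
cleaned_data = data.fillna({'column1': 'N/A', 'column2': 0.0})

print(cleaned_data)
print(cleaned_data['column2'].dtype)  # the numeric column stays numeric
```

Whether 0.0 is a sensible default depends on the data; in some cases dropping the row or leaving the field null is the better choice.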

Clean data by removing duplicate rows and save the cleaned data to a Django model:

import pandas as pd
from myapp.models import MyModel

data = pd.read_csv('data.csv')
cleaned_data = data.drop_duplicates()  # Remove duplicate rows
for index, row in cleaned_data.iterrows():
    my_model = MyModel(field1=row['column1'], field2=row['column2'])
    my_model.save()
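
By default drop_duplicates() compares every column and keeps the first copy of each repeated row. The sketch below uses made-up rows to show that default next to the subset= and keep= options, which the snippet above does not use.

```python
import pandas as pd

# Hypothetical rows: rows 0 and 1 are identical in every column.
data = pd.DataFrame({
    'column1': ['a', 'a', 'b', 'a'],
    'column2': [1, 1, 2, 3],
})

# Default: a row is a duplicate only if all columns match; the first copy is kept.
cleaned_data = data.drop_duplicates()

# Treat rows as duplicates when 'column1' alone matches, keeping the last copy.
by_key = data.drop_duplicates(subset=['column1'], keep='last')

print(list(cleaned_data.index))
print(list(by_key.index))
```

The subset= form is useful when one column acts as a natural key and later rows should override earlier ones.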

Clean data by replacing specific values with another value and save the cleaned data to a Django model:

import pandas as pd
from myapp.models import MyModel

data = pd.read_csv('data.csv')
cleaned_data = data.replace({'column1': {'Value 1': 'New Value', 'Value 2': 'New Value'}})  # Replace specific values in 'column1'
for index, row in cleaned_data.iterrows():
    my_model = MyModel(field1=row['column1'], field2=row['column2'])
    my_model.save()
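
The nested dictionary passed to replace() maps a column name to its own old-value/new-value pairs, so other columns are left alone even if they contain the same text. A small sketch with made-up rows:

```python
import pandas as pd

# Hypothetical rows: 'Value 1' appears in both columns.
data = pd.DataFrame({
    'column1': ['Value 1', 'Value 2', 'Value 3'],
    'column2': ['Value 1', 'x', 'y'],
})

# Outer key = column name, inner dict = {old value: new value} for that column only.
cleaned_data = data.replace(
    {'column1': {'Value 1': 'New Value', 'Value 2': 'New Value'}}
)

print(cleaned_data)
```

Here 'Value 1' in 'column2' stays as it is because the mapping only names 'column1'.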

Clean data by converting column data types and save the cleaned data to a Django model:

import pandas as pd
from myapp.models import MyModel

data = pd.read_csv('data.csv')
# astype(int) raises ValueError on missing values; both calls raise it on non-numeric text
data['column1'] = data['column1'].astype(int)  # Convert 'column1' to integer data type
data['column2'] = data['column2'].astype(float)  # Convert 'column2' to float data type

for index, row in data.iterrows():
    my_model = MyModel(field1=row['column1'], field2=row['column2'])
    my_model.save()
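
When a column may contain missing or non-numeric entries, a more tolerant option is pd.to_numeric() with errors='coerce', which turns anything it cannot parse into NaN instead of raising. This is a sketch with made-up values, and the choice of 0 as the replacement is an assumption for illustration.

```python
import pandas as pd

# Hypothetical raw column as it might arrive from a messy CSV.
raw = pd.Series(['1', '2', 'abc', None])

# Unparseable entries become NaN instead of raising an error.
numeric = pd.to_numeric(raw, errors='coerce')

# Once the gaps are filled, the cast to int is safe.
filled = numeric.fillna(0).astype(int)

print(numeric.isna().tolist())
print(filled.tolist())
```

Coercing silently hides bad input, so it can be worth counting the NaN values first and deciding whether those rows should be fixed, dropped, or reported.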

Clean data by removing leading and trailing whitespace from string columns and save the cleaned data to a Django model:

import pandas as pd
from myapp.models import MyModel

data = pd.read_csv('data.csv')
# The .str accessor only works on text columns; it raises AttributeError on numeric ones
data['column1'] = data['column1'].str.strip()  # Remove leading and trailing whitespace from 'column1'
data['column2'] = data['column2'].str.strip()  # Remove leading and trailing whitespace from 'column2'

for index, row in data.iterrows():
    my_model = MyModel(field1=row['column1'], field2=row['column2'])
    my_model.save()
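
A short sketch with made-up values shows what str.strip() does to text, to missing values, and to a numeric column. The sample data is an assumption chosen to cover those three cases.

```python
import pandas as pd

# Hypothetical text column with stray spaces, a tab, and a missing value.
data = pd.DataFrame({'column1': ['  alice ', 'bob\t', None]})
data['column1'] = data['column1'].str.strip()

# Missing values pass through untouched rather than becoming the text 'None'.
print(data['column1'].tolist())

# On a numeric column the .str accessor is not available at all.
numbers = pd.Series([1.0, 2.0])
error_message = None
try:
    numbers.str.strip()
except AttributeError as exc:
    error_message = str(exc)
print(error_message)
```

If a CSV column is sometimes read as numbers and sometimes as text, checking its dtype before stripping avoids the AttributeError.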

Clean data by applying custom cleaning functions to specific columns and save the cleaned data to a Django model:

import pandas as pd
from myapp.models import MyModel

data = pd.read_csv('data.csv')

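def custom_cleaning_function(value):
    # Placeholder logic (an assumption for illustration): trim whitespace from strings
    # and leave every other value unchanged. Replace this with your own cleaning rules.
    if isinstance(value, str):
        return value.strip()
    return value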

data['column1'] = data['column1'].apply(custom_cleaning_function)  # Apply custom cleaning function to 'column1'
data['column2'] = data['column2'].apply(custom_cleaning_function)  # Apply custom cleaning function to 'column2'

for index, row in data.iterrows():
    my_model = MyModel(field1=row['column1'], field2=row['column2'])
    my_model.save()
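
As one concrete illustration of a custom cleaning function, the sketch below normalizes person names. The function name normalize_name and its rules (trim, collapse repeated spaces, title-case) are hypothetical and only meant to show the shape of such a function.

```python
import pandas as pd


def normalize_name(value):
    """Hypothetical cleaning rule: trim, collapse inner spaces, title-case."""
    if pd.isna(value):
        # Leave missing values alone so they can be handled separately.
        return value
    return ' '.join(str(value).split()).title()


# Made-up sample values covering extra spaces, upper case, and a missing entry.
data = pd.DataFrame({'column1': ['  john   smith', 'ALICE', None]})
data['column1'] = data['column1'].apply(normalize_name)

print(data['column1'].tolist())
```

apply() calls the function once per value, so the function should accept a single value and handle missing entries explicitly.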

In these examples, we load data from a CSV file with pd.read_csv(). We then use pandas methods such as sort_values(), dropna(), fillna(), drop_duplicates(), replace(), astype(), str.strip(), and apply() to sort and clean the data. Finally, we iterate over the resulting DataFrame, create a MyModel instance for each row, and call save() on it, which sends at least one database query per row.
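
For larger files, saving row by row can be slow. One possible alternative is to build all instances first and hand them to Django's bulk_create() in a single call. The sketch below is an assumption-heavy illustration: FakeModel is a stand-in dataclass so the code runs without a configured Django project, and build_instances is a hypothetical helper, not part of pandas or Django.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class FakeModel:
    """Stand-in for MyModel so this runs without a configured Django project."""
    field1: str
    field2: float


def build_instances(frame, model_cls):
    """Turn each DataFrame row into an unsaved model instance."""
    return [
        model_cls(field1=record['column1'], field2=record['column2'])
        for record in frame.to_dict('records')
    ]


# Made-up cleaned data standing in for the result of the steps above.
data = pd.DataFrame({'column1': ['a', 'b'], 'column2': [1.0, 2.0]})
instances = build_instances(data, FakeModel)
print(instances)

# In a real project the equivalent call would be:
# MyModel.objects.bulk_create(build_instances(data, MyModel))
```

Note that bulk_create() does not call each model's save() method and does not send the pre_save and post_save signals, so it is not a drop-in replacement when the model relies on either.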
