Practical Based on Pandas Data Structures

Pandas Data Structures
Series and DataFrame to Manage and Analyze Data

Student Exam Scores Analysis
import pandas as pd
# Create a Pandas Series for student names
students = pd.Series(['Alice', 'Bob', 'Charlie', 'David', 'Eva'])
# Create a DataFrame for their scores
scores = pd.DataFrame({
    'Math': [85, 92, 78, 90, 88],
    'Science': [88, 84, 91, 89, 93],
    'English': [80, 79, 85, 94, 92]
})
# Add the student names to the DataFrame
scores['Student'] = students
# Set "Student" as the index
scores.set_index('Student', inplace=True)
# Display the DataFrame
print("Student Exam Scores:")
print(scores)
# Calculate the average score for each student
scores['Average'] = scores.mean(axis=1)
print("\nAverage Scores:")
print(scores[['Average']])
# Find the student with the highest average score
top_student = scores['Average'].idxmax()
top_average = scores['Average'].max()
print(f"\nTop student: {top_student} with an average score of {top_average:.2f}")
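# Find the subject with the highest average score
# (restrict to the subject columns so the derived 'Average' column is excluded)
subject_avg = scores[['Math', 'Science', 'English']].mean(axis=0)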
top_subject = subject_avg.idxmax()
print(f"\nSubject with the highest average score: {top_subject} ({subject_avg[top_subject]:.2f})")
# Add a column indicating whether a student passed all subjects (pass mark: 50)
scores['Passed All'] = (scores[['Math', 'Science', 'English']] >= 50).all(axis=1)
print("\nScores with Pass Status:")
print(scores)
# Save the DataFrame to a CSV file
scores.to_csv('student_scores.csv', index=True)
print("\nData saved to 'student_scores.csv'")
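
Reading the exported data back is the natural counterpart to the save step. The sketch below is illustrative only: it uses a two-student subset of the data and writes to an in-memory buffer rather than 'student_scores.csv' (an assumption made so the example leaves nothing on disk). Passing index_col to read_csv restores "Student" as the index.

```python
import io

import pandas as pd

# Two-student subset of the scores table, indexed by student name.
scores = pd.DataFrame(
    {"Math": [85, 92], "Science": [88, 84], "English": [80, 79]},
    index=pd.Index(["Alice", "Bob"], name="Student"),
)

# Assumption: an in-memory buffer stands in for 'student_scores.csv'.
buffer = io.StringIO()
scores.to_csv(buffer, index=True)
buffer.seek(0)

# index_col restores 'Student' as the index instead of a regular column.
restored = pd.read_csv(buffer, index_col="Student")
print(restored)
```

With a real file, the same calls would take the file name in place of the buffer.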

Program Highlights
1. Pandas Series: Used for the student names.
2. Pandas DataFrame: Used for managing and analyzing scores for three subjects.
3. Calculations:
  • Computes average scores for each student.
  • Identifies the student with the highest average.
  • Finds the subject with the highest average score.
4. Logical Operation: Adds a "Passed All" column indicating whether a student scored at least 50 in every subject.
5. Data Export: Saves the DataFrame to a CSV file for future use.
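
One detail worth noting about the "highest average" step: in this data set David and Eva have the same average, and idxmax() returns only the first matching label. A minimal sketch of how all tied students could be listed, assuming the same scores as in the program (the variable names first_top and tied_top are illustrative):

```python
import pandas as pd

scores = pd.DataFrame(
    {
        "Math": [85, 92, 78, 90, 88],
        "Science": [88, 84, 91, 89, 93],
        "English": [80, 79, 85, 94, 92],
    },
    index=pd.Index(["Alice", "Bob", "Charlie", "David", "Eva"], name="Student"),
)

# Row-wise mean: one average per student.
average = scores.mean(axis=1)

# idxmax() reports only the first label that holds the maximum.
first_top = average.idxmax()

# Comparing against the maximum keeps every student who shares it.
tied_top = average[average == average.max()].index.tolist()

print(first_top)
print(tied_top)
```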

Sample Output
Student Exam Scores:
         Math  Science  English
Student
Alice      85       88       80
Bob        92       84       79
Charlie    78       91       85
David      90       89       94
Eva        88       93       92

Average Scores:
           Average
Student
Alice    84.333333
Bob      85.000000
Charlie  84.666667
David    91.000000
Eva      91.000000

Top student: David with an average score of 91.00

Subject with the highest average score: Science (89.00)

Scores with Pass Status:
         Math  Science  English    Average  Passed All
Student
Alice      85       88       80  84.333333        True
Bob        92       84       79  85.000000        True
Charlie    78       91       85  84.666667        True
David      90       89       94  91.000000        True
Eva        88       93       92  91.000000        True

Data saved to 'student_scores.csv'

Usage
  • Pandas Series and DataFrame make it easy to store, manipulate, and analyze tabular data.
  • They support common tasks such as:
      • Computing averages.
      • Sorting and filtering.
      • Adding calculated columns.
  • Exporting to a CSV file is useful for real-world applications where data persistence is needed.
This example is suitable for learning and can be extended for more advanced use cases.
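
Sorting and filtering are listed among the typical tasks but are not exercised by the program itself. A minimal sketch of one possible extension, assuming the same scores table (the names ranked and strong_math, and the 90-mark threshold, are illustrative choices):

```python
import pandas as pd

scores = pd.DataFrame(
    {
        "Math": [85, 92, 78, 90, 88],
        "Science": [88, 84, 91, 89, 93],
        "English": [80, 79, 85, 94, 92],
    },
    index=pd.Index(["Alice", "Bob", "Charlie", "David", "Eva"], name="Student"),
)
scores["Average"] = scores[["Math", "Science", "English"]].mean(axis=1)

# Sorting: rank students by average score, highest first.
ranked = scores.sort_values("Average", ascending=False)

# Filtering: a boolean mask keeps only the rows where the condition holds.
strong_math = scores[scores["Math"] >= 90]

print(ranked)
print(strong_math)
```

Both operations return new DataFrames, so the original scores table is left unchanged.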
