This Python program demonstrates loading data, storing it, and handling file formats with pandas.
It covers reading from and writing to CSV, Excel, and JSON files.
Data Loading, Storage, and File Formats
import pandas as pd
# Sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, 27, 22, 32, 29],
    'Department': ['HR', 'IT', 'Finance', 'Marketing', 'IT'],
    'Salary': [50000, 60000, 55000, 70000, 65000]
}
# Creating a DataFrame
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# --- Save the DataFrame to various formats ---
# 1. Save to CSV
csv_file = 'employee_data.csv'
df.to_csv(csv_file, index=False)
print(f"\nData saved to CSV file: {csv_file}")
# 2. Save to Excel (pandas needs an Excel engine such as openpyxl installed for .xlsx files)
excel_file = 'employee_data.xlsx'
df.to_excel(excel_file, index=False, sheet_name='Employees')
print(f"Data saved to Excel file: {excel_file}")
# 3. Save to JSON (lines=True writes one JSON record per line)
json_file = 'employee_data.json'
df.to_json(json_file, orient='records', lines=True)
print(f"Data saved to JSON file: {json_file}")
# --- Load the data back into DataFrames ---
# 1. Load from CSV
df_csv = pd.read_csv(csv_file)
print("\nData loaded from CSV:")
print(df_csv)
# 2. Load from Excel
df_excel = pd.read_excel(excel_file, sheet_name='Employees')
print("\nData loaded from Excel:")
print(df_excel)
# 3. Load from JSON
df_json = pd.read_json(json_file, orient='records', lines=True)
print("\nData loaded from JSON:")
print(df_json)
# --- Additional Analysis ---
# Filter data: Employees with Salary > 60000
high_salary = df_csv[df_csv['Salary'] > 60000]
print("\nEmployees with Salary > 60000:")
print(high_salary)
# Export filtered data to a new CSV file
filtered_csv_file = 'high_salary_employees.csv'
high_salary.to_csv(filtered_csv_file, index=False)
print(f"\nFiltered data saved to CSV file: {filtered_csv_file}")
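The JSON step above passes lines=True, which changes the layout of the file: each row becomes its own JSON object on its own line instead of one element in a single array. The sketch below illustrates the difference on a small, made-up two-row frame, using in-memory strings rather than files. The variable names are illustrative and not part of the original program.

```python
import io
import json

import pandas as pd

# Small illustrative frame (not the full employee data).
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Salary': [50000, 60000],
})

# With lines=True, to_json returns one JSON object per line.
json_lines = df.to_json(orient='records', lines=True)
records = [json.loads(line) for line in json_lines.splitlines()]
print(records)

# Without lines=True, the same rows are written as a single JSON array.
json_array = df.to_json(orient='records')
print(json.loads(json_array))

# Text written with lines=True must also be read back with lines=True.
df_back = pd.read_json(io.StringIO(json_lines), orient='records', lines=True)
print(df_back)
```

Both layouts carry the same records. What matters is that the reader and the writer agree on the lines setting.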
Program Highlights
1. Data Creation:
- A sample DataFrame is created using a dictionary.
2. File Operations:
- Saves data to CSV, Excel, and JSON formats.
- Loads data back from these formats into DataFrames.
3. Data Analysis:
- Filters employees with a salary greater than 60,000.
- Saves the filtered data to a new CSV file.
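One way to confirm that saving and loading preserve the data is to compare each reloaded frame with the original. The following is a minimal sketch under a few assumptions: the helper name check_round_trip is hypothetical, files go to a temporary directory, and the Excel step is left out so the example runs without an Excel engine installed.

```python
import tempfile
from pathlib import Path

import pandas as pd
from pandas.testing import assert_frame_equal

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, 27, 22, 32, 29],
    'Department': ['HR', 'IT', 'Finance', 'Marketing', 'IT'],
    'Salary': [50000, 60000, 55000, 70000, 65000],
})


def check_round_trip(frame):
    """Write frame to CSV and JSON, reload both, and compare with the original."""
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = Path(tmp) / 'employee_data.csv'
        json_path = Path(tmp) / 'employee_data.json'

        frame.to_csv(csv_path, index=False)
        frame.to_json(json_path, orient='records', lines=True)

        from_csv = pd.read_csv(csv_path)
        from_json = pd.read_json(json_path, orient='records', lines=True)

    # Raises AssertionError if values, columns, or index differ.
    # Dtypes are not compared because they can vary by reader and version.
    assert_frame_equal(frame, from_csv, check_dtype=False)
    assert_frame_equal(frame, from_json, check_dtype=False)
    return True


round_trip_ok = check_round_trip(df)
print(round_trip_ok)
```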
Sample Output
Original DataFrame:
      Name  Age Department  Salary
0    Alice   24         HR   50000
1      Bob   27         IT   60000
2  Charlie   22    Finance   55000
3    David   32  Marketing   70000
4      Eva   29         IT   65000

Data saved to CSV file: employee_data.csv
Data saved to Excel file: employee_data.xlsx
Data saved to JSON file: employee_data.json

Data loaded from CSV:
      Name  Age Department  Salary
0    Alice   24         HR   50000
1      Bob   27         IT   60000
2  Charlie   22    Finance   55000
3    David   32  Marketing   70000
4      Eva   29         IT   65000

Data loaded from Excel:
      Name  Age Department  Salary
0    Alice   24         HR   50000
1      Bob   27         IT   60000
2  Charlie   22    Finance   55000
3    David   32  Marketing   70000
4      Eva   29         IT   65000

Data loaded from JSON:
      Name  Age Department  Salary
0    Alice   24         HR   50000
1      Bob   27         IT   60000
2  Charlie   22    Finance   55000
3    David   32  Marketing   70000
4      Eva   29         IT   65000

Employees with Salary > 60000:
    Name  Age Department  Salary
3  David   32  Marketing   70000
4    Eva   29         IT   65000

Filtered data saved to CSV file: high_salary_employees.csv
Features Covered
1. File Formats:
- CSV: Universal and lightweight.
- Excel: Suitable for spreadsheets.
- JSON: Ideal for structured data in web applications.
2. Flexibility:
- Each format is read and written with a single pandas call, so the rest of the program logic stays unchanged.
3. Filter and Export:
- Demonstrates how to analyze and save processed data.
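The flexibility point can be made concrete by choosing the reader from the file extension, so the calling code never changes. This is only a sketch: the READERS mapping and the load_table helper are hypothetical names, and the JSON entry assumes the one-record-per-line layout that the program writes.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical mapping from file extension to a pandas reader.
# The JSON entry assumes files written with orient='records', lines=True.
READERS = {
    '.csv': pd.read_csv,
    '.xlsx': pd.read_excel,  # needs an Excel engine such as openpyxl
    '.json': lambda path: pd.read_json(path, orient='records', lines=True),
}


def load_table(path):
    """Pick a reader from the file extension and return a DataFrame."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file format: {suffix!r}")
    return READERS[suffix](path)


# Usage: the same call loads either file.
df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Salary': [50000, 60000]})
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / 'employees.csv'
    json_path = Path(tmp) / 'employees.json'
    df.to_csv(csv_path, index=False)
    df.to_json(json_path, orient='records', lines=True)
    loaded = [load_table(p) for p in (csv_path, json_path)]

for frame in loaded:
    print(frame)
```

Supporting another format then means adding one entry to the mapping rather than touching the code that consumes the DataFrame.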
This program can be expanded to handle other formats like SQL databases or additional data analysis tasks.
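As one possible extension toward SQL, pandas can write to and query a SQLite database through Python's built-in sqlite3 module. The sketch below assumes an in-memory database and a table named employees. Both are illustrative choices, not part of the original program.

```python
import sqlite3
from contextlib import closing

import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, 27, 22, 32, 29],
    'Department': ['HR', 'IT', 'Finance', 'Marketing', 'IT'],
    'Salary': [50000, 60000, 55000, 70000, 65000],
})

# ':memory:' creates a throwaway database; pass a file path to persist it.
with closing(sqlite3.connect(':memory:')) as conn:
    # Write the DataFrame to a table (assumed name: employees).
    df.to_sql('employees', conn, index=False, if_exists='replace')

    # Run the salary filter in SQL instead of in pandas.
    high_salary = pd.read_sql_query(
        'SELECT * FROM employees WHERE Salary > ? ORDER BY Salary DESC',
        conn,
        params=(60000,),
    )

print(high_salary)
```

The query returns the same two employees as the pandas filter, here sorted by salary in descending order.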