Data Loading and Preparation

The first step was to import the required libraries, mount Google Drive in Colab, and load the three datasets:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from google.colab import drive

drive.mount('/content/drive')

owners_renters = pd.read_csv('/content/drive/My Drive/fannie mae/owners_renters_cleaned.csv')
labor_unemployment = pd.read_csv('/content/drive/My Drive/fannie mae/labor_unemployment_2023.csv')
population_sex_combined = pd.read_csv('/content/drive/My Drive/fannie mae/population_sex_combined.csv')

Data Merging

To relate population to unemployment, the population and labor datasets were merged on the CountyName column. An inner join keeps only the counties that appear in both datasets:

merged_data = pd.merge(population_sex_combined, labor_unemployment,
                       on='CountyName', how='inner')
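
County names are not guaranteed to be unique (the same name can occur in more than one state), so it can be worth checking how the join behaved. The sketch below uses small made-up frames, not the project data: only the column names CountyName, TotalEstimate, and UnemploymentRate come from the text, and all values are invented for illustration. It relies on the validate and indicator parameters of pd.merge.

```python
import pandas as pd

# Toy stand-ins for the real datasets (values are made up for illustration).
population = pd.DataFrame({
    "CountyName": ["Adams", "Baker", "Clark"],
    "TotalEstimate": [1000, 2500, 400],
})
labor = pd.DataFrame({
    "CountyName": ["Adams", "Baker", "Dixon"],
    "UnemploymentRate": [3.1, 4.8, 5.5],
})

# validate="one_to_one" raises MergeError if CountyName repeats in either frame.
inner = pd.merge(population, labor, on="CountyName", how="inner",
                 validate="one_to_one")

# An outer merge with indicator=True adds a "_merge" column that shows
# which rows matched and which exist on one side only.
audit = pd.merge(population, labor, on="CountyName", how="outer",
                 indicator=True)
dropped = audit.loc[audit["_merge"] != "both", "CountyName"].tolist()

print(inner)
print("Dropped by the inner join:", sorted(dropped))
```

If the validate check fails on real data, one option is to merge on a key that also includes the state, assuming such a column is available.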

Top Counties Analysis

Population Analysis

The next step was to identify the top 10 counties by population:

top10_population = merged_data.nlargest(10, 'TotalEstimate')
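
As a small illustration of what nlargest returns, the sketch below applies it to a made-up frame. The column names follow the text, but the data are invented. Rows come back sorted in descending order of the chosen column, and keep='all' is one way to retain ties at the cutoff rather than truncating them.

```python
import pandas as pd

# Toy data for illustration only; Baker and Dixon tie for the largest value.
toy = pd.DataFrame({
    "CountyName": ["Adams", "Baker", "Clark", "Dixon", "Evans"],
    "TotalEstimate": [1000, 2500, 400, 2500, 1800],
})

# The three largest rows, already sorted from largest to smallest.
top3 = toy.nlargest(3, "TotalEstimate")

# With keep="all", every row tied at the cutoff is returned,
# so asking for 1 row can yield more than 1.
top1_with_ties = toy.nlargest(1, "TotalEstimate", keep="all")

print(top3)
print(top1_with_ties)
```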

Visualization of Population

A bar plot was created to visualize the top 10 counties by population:

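plt.figure(figsize=(10, 6))
sns.barplot(x='CountyName', y='TotalEstimate', data=top10_population)
plt.title('Top 10 Counties by Population')
# Right-align the rotated labels so each one ends under its own bar
plt.xticks(rotation=45, ha='right')
# Keep the rotated labels from being clipped at the bottom of the figure
plt.tight_layout()
plt.show()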

[Figure: bar plot, Top 10 Counties by Population]

Unemployment Rate Analysis

Similarly, the top 10 counties by unemployment rate were determined:

top10_unemployment = merged_data.nlargest(10, 'UnemploymentRate')
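
nlargest needs a numeric column. If UnemploymentRate happened to be read as text (for example with a trailing percent sign, which is an assumption here and not something the source states), it would need converting first. A minimal sketch on made-up data:

```python
import pandas as pd

# Toy data for illustration only; the string formats are hypothetical.
toy = pd.DataFrame({
    "CountyName": ["Adams", "Baker", "Clark", "Dixon"],
    "UnemploymentRate": ["3.1%", "4.8%", "n/a", "5.5%"],
})

# Strip the percent sign, then convert; unparseable entries become NaN.
rate = pd.to_numeric(toy["UnemploymentRate"].str.rstrip("%"), errors="coerce")
toy = toy.assign(UnemploymentRate=rate)

# Rows with a missing rate are not selected ahead of rows with a valid rate.
top2 = toy.nlargest(2, "UnemploymentRate")

print(top2)
```

Counting the NaN values after conversion (rate.isna().sum()) gives a quick view of how many counties lack a usable rate.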

Visualization of Unemployment Rates