Load Titanic data from sklearn library, plot the following with proper legend and axis labels: a. Plot bar chart to show the frequency of survivors and non-survivors for male and female passengers separately b. Draw a scatter plot for any two selected features c. Compare density distribution for features age and passenger fare d. Use a pair plot to show pairwise bivariate distribution

 Load Titanic data from sklearn library, plot the following with proper legend and axis labels:

a. Plot bar chart to show the frequency of survivors and non-survivors for male and female passengers separately

b. Draw a scatter plot for any two selected features

c. Compare density distribution for features age and passenger fare

d. Use a pair plot to show pairwise bivariate distribution


CODE





import seaborn as sns

import matplotlib.pyplot as plt


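# Load the Titanic dataset from seaborn (scikit-learn has no bundled Titanic loader;
# it can only fetch the data from OpenML via fetch_openml)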

titanic = sns.load_dataset('titanic')


# Plot bar chart to show the frequency of survivors and non-survivors for male and female passengers separately

plt.figure(figsize=(8, 6))

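ax = sns.countplot(x='sex', hue='survived', data=titanic)

plt.xlabel('Sex')

plt.ylabel('Frequency')

plt.title('Survivors vs Non-Survivors by Sex')

# Reuse seaborn's own legend handles so the 'No'/'Yes' labels keep the correct bar colors

handles, _ = ax.get_legend_handles_labels()

ax.legend(handles=handles, labels=['No', 'Yes'], title='Survived')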

plt.show()
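
The bar heights in the chart above can be cross-checked numerically. The sketch below is one way to do that with `pandas.crosstab`. The helper name `survival_counts` and the five-row toy frame are made up for illustration, so the snippet runs without downloading anything.

```python
import pandas as pd


def survival_counts(df):
    """Count passengers per (sex, survived) pair; hypothetical helper."""
    # Rows are indexed by sex, columns by the survived flag (0 = no, 1 = yes).
    return pd.crosstab(df["sex"], df["survived"])


# Toy data for illustration only, NOT taken from the Titanic dataset.
toy = pd.DataFrame({
    "sex": ["male", "male", "female", "female", "female"],
    "survived": [0, 1, 1, 1, 0],
})

counts = survival_counts(toy)
print(counts)
```

Passing the real `titanic` frame instead of `toy` should give the four counts that the bar chart displays.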


# Draw a scatter plot for any two selected features

plt.figure(figsize=(8, 6))

sns.scatterplot(x='age', y='fare', data=titanic)

plt.xlabel('Age')

plt.ylabel('Fare')

plt.title('Scatter Plot: Age vs Fare')

plt.show()


# Compare density distribution for features age and passenger fare

plt.figure(figsize=(8, 6))

sns.kdeplot(data=titanic, x='age', label='Age')

sns.kdeplot(data=titanic, x='fare', label='Fare')

plt.xlabel('Value')

plt.ylabel('Density')

plt.title('Density Distribution: Age vs Fare')

plt.legend()

plt.show()
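
Age is measured in years and fare in currency units, so the two curves above share an x-axis without sharing a scale. One possible way to make their shapes comparable is to standardize each feature to zero mean and unit standard deviation first. This is a sketch of that idea, not part of the original solution; `zscore` and `plot_standardized_densities` are hypothetical helper names.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def zscore(series):
    """Standardize a numeric Series to mean 0 and sample standard deviation 1."""
    clean = series.dropna()  # missing values (e.g. unknown ages) are ignored
    return (clean - clean.mean()) / clean.std()


def plot_standardized_densities(df, columns=("age", "fare")):
    """Overlay KDE curves of the standardized columns on one set of axes."""
    fig, ax = plt.subplots(figsize=(8, 6))
    for col in columns:
        sns.kdeplot(x=zscore(df[col]), label=col.capitalize(), ax=ax)
    ax.set_xlabel("Standardized value (z-score)")
    ax.set_ylabel("Density")
    ax.set_title("Standardized Density: Age vs Fare")
    ax.legend()
    return ax


# Small made-up example of the standardization step on its own.
print(zscore(pd.Series([10.0, 20.0, 30.0, None])).tolist())
```

With the frame loaded earlier, calling `plot_standardized_densities(titanic)` followed by `plt.show()` would draw the comparison.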


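# Use a pair plot to show pairwise bivariate distributions
# (numeric columns only, colored by survival so the plot has a legend)

sns.pairplot(titanic[['survived', 'pclass', 'age', 'fare']].dropna(), hue='survived')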

plt.show()
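
The task statement asks for the data to come from sklearn, whereas the code above uses seaborn's copy. scikit-learn does not ship the Titanic data itself, but `fetch_openml` can download it from OpenML. The following is a sketch under assumptions: that the OpenML dataset named `titanic` (version 1) is the intended one, that a network connection is available, and that `survived` may arrive as a categorical column of strings. The helper names are hypothetical.

```python
import pandas as pd


def tidy_survived(frame):
    """Return a copy whose 'survived' column is an integer 0/1 flag.

    Assumption: the downloaded column may be categorical or string-typed.
    """
    out = frame.copy()
    out["survived"] = pd.to_numeric(out["survived"].astype(str)).astype(int)
    return out


def load_titanic_from_sklearn():
    """Download the Titanic data from OpenML through scikit-learn."""
    # Imported lazily so tidy_survived works without scikit-learn installed.
    from sklearn.datasets import fetch_openml

    bunch = fetch_openml("titanic", version=1, as_frame=True)
    return tidy_survived(bunch.frame)


# Offline illustration with made-up rows (not real passenger records).
toy = pd.DataFrame({
    "survived": pd.Categorical(["0", "1", "1"]),
    "sex": ["male", "female", "female"],
})
print(tidy_survived(toy)["survived"].tolist())
```

`titanic = load_titanic_from_sklearn()` could then replace the seaborn call. The columns `sex`, `age`, `fare` and `survived` used by the plots are expected to exist under the same names, though that is worth checking on the downloaded frame.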

