My company uses a weird date notation with this format: [2-digit week number][2-digit working hours]. Both groups use leading zeros, so the data could look like: 0801, 0802, 0901, 0902, 0903, 1001, 1002, 1003
For each of these "dates" there is a score, which is just a regular floating-point number from 0 to 100.
wxxhxx,scoring
0101,5.3
0102,6.6
0103,6.2
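To make the notation concrete, here is a small sketch of how one code splits into its two parts. `split_code` is a hypothetical helper name used only for illustration, and it assumes the code is kept as a 4-character string.

```python
def split_code(code: str) -> tuple[int, int]:
    """Split a 4-character 'wxxhxx' code into (week number, working hours)."""
    if len(code) != 4 or not code.isdigit():
        raise ValueError(f"expected 4 digits, got {code!r}")
    # First two digits are the week, last two are the working hours.
    return int(code[:2]), int(code[2:])


week, hours = split_code("0801")
```

The hours part starts again at 01 in every week, so consecutive codes are not evenly spaced when they are read as plain numbers.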
With this data I want to create a scatter plot including a linear regression!
I was able to create this regression using Seaborn (which uses matplotlib). Yet some binning is happening here:
The code I’m using:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('test.csv')
# x, y and data are passed by keyword; newer seaborn versions reject positional x and y
sns.regplot(x='wxxhxx', y='scoring', data=df)
plt.show()
With a strip plot the x values are spaced evenly:
sns.stripplot(x='wxxhxx', y='scoring', data=df)
Is there any way to combine the looks of these two methods (regplot and stripplot)? I would like to have a regression within the strip plot, OR I would like to get an equidistant distribution of the x values within the regplot (like in the strip plot), so that the values don't stack.
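A rough sketch of what the second option might look like, assuming the codes are read as strings (so the leading zeros survive) and that sorting them gives the right order. `x_pos` is a hypothetical helper column, and the inline CSV only repeats the three sample rows from above in place of test.csv.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# The three sample rows from above; the real data would come from test.csv.
csv_text = "wxxhxx,scoring\n0101,5.3\n0102,6.6\n0103,6.2\n"

# Read the codes as strings so the leading zeros are kept.
df = pd.read_csv(io.StringIO(csv_text), dtype={"wxxhxx": str})

# Give each distinct code an evenly spaced position 0, 1, 2, ...
# Zero-padded codes sort correctly as plain strings.
codes = sorted(df["wxxhxx"].unique())
position = {code: i for i, code in enumerate(codes)}
df["x_pos"] = df["wxxhxx"].map(position)  # hypothetical helper column

# Fit and draw the regression against the positions, not the raw codes.
fig, ax = plt.subplots()
sns.regplot(x="x_pos", y="scoring", data=df, ax=ax)

# Label the ticks with the original codes again.
ax.set_xticks(range(len(codes)))
ax.set_xticklabels(codes)
ax.set_xlabel("wxxhxx")
```

Calling `plt.show()` afterwards displays the figure. Note that the fit then treats every consecutive code as exactly one step apart, so the slope is "score per step" and not "score per unit of real time".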
Thanks for any advice!