Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.0 KiB
Average record size in memory: 41.3 B

Variable types

Numeric: 5

Warnings

a has unique values (Unique)
b has unique values (Unique)
c has unique values (Unique)
d has unique values (Unique)
e has unique values (Unique)

Reproduction

Analysis started: 2021-03-08 10:28:28.495818
Analysis finished: 2021-03-08 10:28:37.743131
Duration: 9.25 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

a
Real number (ℝ≥0)

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.503908377
Minimum: 0.004618840841
Maximum: 0.9948559816
Zeros: 0
Zeros (%): 0.0%
Memory size: 928.0 B

Quantile statistics

Minimum: 0.004618840841
5-th percentile: 0.0573866932
Q1: 0.229849404
Median: 0.4927939253
Q3: 0.7958781147
95-th percentile: 0.9645882588
Maximum: 0.9948559816
Range: 0.9902371408
Interquartile range (IQR): 0.5660287107

Descriptive statistics

Standard deviation: 0.3031917027
Coefficient of variation (CV): 0.6016802191
Kurtosis: -1.316606168
Mean: 0.503908377
Median Absolute Deviation (MAD): 0.2715844108
Skewness: -0.03265297869
Sum: 50.3908377
Variance: 0.09192520858
Monotonicity: Not monotonic
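
Several of the figures listed for variable a follow directly from others in the same listing, which gives a quick consistency check. The sketch below recomputes them from the rounded values printed in this report, so agreement is only expected to about eight decimal places. The variable names are illustrative and are not part of pandas-profiling.

```python
# Values copied from the report for variable "a" (rounded as printed).
n = 100
minimum, maximum = 0.004618840841, 0.9948559816
q1, q3 = 0.229849404, 0.7958781147
mean, std = 0.503908377, 0.3031917027

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = standard deviation squared
total = mean * n                 # Sum = mean * number of observations

for label, value in [
    ("Range", value_range),
    ("IQR", iqr),
    ("CV", cv),
    ("Variance", variance),
    ("Sum", total),
]:
    print(f"{label:<10}{value:.10f}")
```

The printed values can be compared against the Range, Interquartile range, Coefficient of variation, Variance and Sum entries reported for a.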
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.492222013         1      1.0%
0.8259258141        1      1.0%
0.6214627447        1      1.0%
0.4941692419        1      1.0%
0.7997209243        1      1.0%
0.6502111542        1      1.0%
0.4933658376        1      1.0%
0.4558190904        1      1.0%
0.3223939325        1      1.0%
0.2796189859        1      1.0%
Other values (90)   90     90.0%

Value               Count  Frequency (%)
0.004618840841      1      1.0%
0.01132563186       1      1.0%
0.01639552057       1      1.0%
0.03096723022       1      1.0%
0.03128953882       1      1.0%

Value               Count  Frequency (%)
0.9948559816        1      1.0%
0.97987841          1      1.0%
0.9793969653        1      1.0%
0.9736460777        1      1.0%
0.9690324864        1      1.0%

b
Real number (ℝ≥0)

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.502991704
Minimum: 0.008135430083
Maximum: 0.9938126032
Zeros: 0
Zeros (%): 0.0%
Memory size: 928.0 B

Quantile statistics

Minimum: 0.008135430083
5-th percentile: 0.0425373174
Q1: 0.2457610022
Median: 0.4940385167
Q3: 0.8061123021
95-th percentile: 0.9609461044
Maximum: 0.9938126032
Range: 0.9856771731
Interquartile range (IQR): 0.5603512999

Descriptive statistics

Standard deviation: 0.307379033
Coefficient of variation (CV): 0.6111015959
Kurtosis: -1.268545649
Mean: 0.502991704
Median Absolute Deviation (MAD): 0.3028872137
Skewness: 0.03064148551
Sum: 50.2991704
Variance: 0.09448186994
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.008135430083      1      1.0%
0.3017528153        1      1.0%
0.7966725452        1      1.0%
0.4021483522        1      1.0%
0.4971658774        1      1.0%
0.1637774292        1      1.0%
0.5281517907        1      1.0%
0.3976859931        1      1.0%
0.04287387763       1      1.0%
0.6484906903        1      1.0%
Other values (90)   90     90.0%

Value               Count  Frequency (%)
0.008135430083      1      1.0%
0.01162152645       1      1.0%
0.02592362188       1      1.0%
0.0355637753        1      1.0%
0.03614267317       1      1.0%

Value               Count  Frequency (%)
0.9938126032        1      1.0%
0.9811476346        1      1.0%
0.9764552522        1      1.0%
0.9638906852        1      1.0%
0.9613993827        1      1.0%

c
Real number (ℝ≥0)

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.533270135
Minimum: 0.02787588429
Maximum: 0.9974024938
Zeros: 0
Zeros (%): 0.0%
Memory size: 928.0 B

Quantile statistics

Minimum: 0.02787588429
5-th percentile: 0.08734609577
Q1: 0.3244676278
Median: 0.5388050814
Q3: 0.7367601396
95-th percentile: 0.9826822791
Maximum: 0.9974024938
Range: 0.9695266095
Interquartile range (IQR): 0.4122925118

Descriptive statistics

Standard deviation: 0.2752608518
Coefficient of variation (CV): 0.5161752622
Kurtosis: -0.9748851823
Mean: 0.533270135
Median Absolute Deviation (MAD): 0.2018406793
Skewness: -0.08754946889
Sum: 53.3270135
Variance: 0.07576853652
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.1769272614        1      1.0%
0.7464686128        1      1.0%
0.9140032823        1      1.0%
0.7115744152        1      1.0%
0.9262724651        1      1.0%
0.3258010631        1      1.0%
0.9491191935        1      1.0%
0.705763369         1      1.0%
0.6341318404        1      1.0%
0.1401417588        1      1.0%
Other values (90)   90     90.0%

Value               Count  Frequency (%)
0.02787588429       1      1.0%
0.03099609769       1      1.0%
0.0595528335        1      1.0%
0.06383880856       1      1.0%
0.07657082545       1      1.0%

Value               Count  Frequency (%)
0.9974024938        1      1.0%
0.9895713607        1      1.0%
0.9877031192        1      1.0%
0.9864703503        1      1.0%
0.9831947511        1      1.0%

d
Real number (ℝ≥0)

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5226326021
Minimum: 0.001663493784
Maximum: 0.9768211003
Zeros: 0
Zeros (%): 0.0%
Memory size: 928.0 B

Quantile statistics

Minimum: 0.001663493784
5-th percentile: 0.01665174555
Q1: 0.2609049967
Median: 0.527300767
Q3: 0.7958739524
95-th percentile: 0.949768511
Maximum: 0.9768211003
Range: 0.9751576065
Interquartile range (IQR): 0.5349689558

Descriptive statistics

Standard deviation: 0.3048124497
Coefficient of variation (CV): 0.5832250963
Kurtosis: -1.175158771
Mean: 0.5226326021
Median Absolute Deviation (MAD): 0.2697336556
Skewness: -0.170068292
Sum: 52.26326021
Variance: 0.09291062949
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.4371440403        1      1.0%
0.2493370867        1      1.0%
0.8143973725        1      1.0%
0.6787783501        1      1.0%
0.2401921199        1      1.0%
0.7939343709        1      1.0%
0.8366201316        1      1.0%
0.5401815202        1      1.0%
0.0284933881        1      1.0%
0.9768211003        1      1.0%
Other values (90)   90     90.0%

Value               Count  Frequency (%)
0.001663493784      1      1.0%
0.002429415707      1      1.0%
0.0107756085        1      1.0%
0.01275979108       1      1.0%
0.01288492          1      1.0%

Value               Count  Frequency (%)
0.9768211003        1      1.0%
0.9751080753        1      1.0%
0.9745931732        1      1.0%
0.9693160791        1      1.0%
0.9503553761        1      1.0%

e
Real number (ℝ≥0)

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4746799421
Minimum: 0.01237932359
Maximum: 0.9946648478
Zeros: 0
Zeros (%): 0.0%
Memory size: 928.0 B

Quantile statistics

Minimum: 0.01237932359
5-th percentile: 0.05180923674
Q1: 0.2213758844
Median: 0.4563179298
Q3: 0.7060146665
95-th percentile: 0.9451546058
Maximum: 0.9946648478
Range: 0.9822855243
Interquartile range (IQR): 0.4846387821

Descriptive statistics

Standard deviation: 0.2930111619
Coefficient of variation (CV): 0.617281532
Kurtosis: -1.191685664
Mean: 0.4746799421
Median Absolute Deviation (MAD): 0.2443353164
Skewness: 0.09865670732
Sum: 47.46799421
Variance: 0.085855541
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.8954578936        1      1.0%
0.9756806151        1      1.0%
0.06439842133       1      1.0%
0.5382605987        1      1.0%
0.3716279243        1      1.0%
0.6887430866        1      1.0%
0.05226614649       1      1.0%
0.3954531055        1      1.0%
0.05675691612       1      1.0%
0.3079928748        1      1.0%
Other values (90)   90     90.0%

Value               Count  Frequency (%)
0.01237932359       1      1.0%
0.0194737143        1      1.0%
0.02293926738       1      1.0%
0.02351243916       1      1.0%
0.04312795163       1      1.0%

Value               Count  Frequency (%)
0.9946648478        1      1.0%
0.9756806151        1      1.0%
0.9698074444        1      1.0%
0.9520200467        1      1.0%
0.9504440034        1      1.0%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
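
The calculation described here can be sketched in plain Python as follows. This is a minimal illustration, not the code the report itself runs; the function name `pearson_r` and the toy data are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of deviations are sufficient.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # Note: a constant input makes the denominator zero; r is undefined then.
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical toy data, not taken from the report.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2 * v + 1 for v in x]         # increasing linear function of x
y_down = [-0.5 * v + 10 for v in x]   # decreasing linear function of x
y_noisy = [1.2, 1.9, 3.4, 3.7, 5.3]   # roughly linear

print(pearson_r(x, y_up))
print(pearson_r(x, y_down))
# Shifting and rescaling x leaves r unchanged (location/scale invariance).
print(pearson_r(x, y_noisy), pearson_r([10 * v - 3 for v in x], y_noisy))
```

The two exactly linear series give the extreme values of r, and the last line shows the same r before and after a change of location and scale.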

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
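
As a rough sketch of that recipe, the code below ranks both variables and then applies the Pearson formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the toy data are illustrative assumptions; ties are given average ranks, which is one common convention.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical toy data: y is a monotonic but nonlinear function of x.
x = [0.5, 1.0, 2.0, 3.0, 4.0]
y = [v ** 3 for v in x]

print(spearman_rho(x, y))  # the ranks agree exactly
print(pearson_r(x, y))     # lower, because the relationship is not linear
```

The cubic relationship illustrates the point made above: the ranks agree perfectly, so ρ reaches its maximum while r does not.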

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
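
A direct, quadratic-time sketch of this pair-counting definition is shown below. The function name `kendall_tau` and the toy data are assumptions for illustration. The formula as stated corresponds to the variant usually called tau-a; statistical libraries often apply a tie-adjusted variant instead, which should coincide with it when no values are tied, as is the case for the all-unique columns in this report.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables order the pair the same way
        elif product < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # product == 0 is a tie and counts toward neither group
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical toy data, not taken from the report.
x = [1, 2, 3, 4]
print(kendall_tau(x, [1, 2, 3, 4]))  # identical ordering
print(kendall_tau(x, [4, 3, 2, 1]))  # reversed ordering
print(kendall_tau(x, [1, 3, 2, 4]))  # one swapped pair out of six
```

In the last call five of the six pairs are concordant and one is discordant, so τ is (5 - 1) / 6.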

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

    a         b         c         d         e
0   0.217478  0.055103  0.536073  0.016850  0.059588
1   0.616448  0.140932  0.596031  0.844206  0.459302
2   0.741943  0.248941  0.485407  0.799355  0.723326
3   0.126302  0.148725  0.598303  0.511774  0.112611
4   0.230895  0.122043  0.744351  0.182043  0.950444
5   0.265413  0.086713  0.459692  0.039601  0.788174
6   0.004619  0.097265  0.450541  0.510044  0.944876
7   0.861936  0.674451  0.343800  0.002429  0.285993
8   0.822483  0.568061  0.425310  0.351730  0.702625
9   0.794597  0.048395  0.859247  0.781511  0.895458

Last rows

    a         b         c         d         e
90  0.058760  0.225653  0.508717  0.949738  0.149538
91  0.973646  0.912131  0.700025  0.101490  0.270646
92  0.893534  0.271964  0.181576  0.713691  0.392824
93  0.775806  0.051574  0.648272  0.149495  0.395453
94  0.408722  0.480842  0.346753  0.195286  0.591826
95  0.651399  0.378743  0.170211  0.880250  0.673811
96  0.455819  0.833746  0.941473  0.832700  0.077265
97  0.799957  0.875586  0.251382  0.916071  0.744029
98  0.330841  0.673676  0.831134  0.976821  0.530809
99  0.113169  0.961399  0.769719  0.475008  0.643093