Overview

Dataset statistics

Number of variables: 5
Number of observations: 1000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 39.2 KiB
Average record size in memory: 40.1 B
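
As a quick consistency check, the average record size should equal the total in-memory size divided by the number of observations. The sketch below redoes that arithmetic with the figures reported above. The variable names are illustrative, and it assumes 1 KiB = 1024 bytes.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 39.2
n_observations = 1000
n_variables = 5
missing_cells = 0

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

# Share of missing cells over all cells (rows x columns).
missing_pct = 100 * missing_cells / (n_observations * n_variables)

print(f"Average record size: {avg_record_bytes:.1f} B")
print(f"Missing cells: {missing_pct:.1f}%")
```

Under that assumption the result rounds to the 40.1 B shown in the report.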

Variable types

Categorical: 4
Numeric: 1

Dataset

Description: This profiling report was generated for the DataCamp learning resources.
Author: JR
URL: https://data.gov/
Copyright: (c) JR_DataCamp, Inc. 2024

Reproduction

Analysis started: 2024-10-29 11:48:19.635833
Analysis finished: 2024-10-29 11:48:21.128788
Duration: 1.49 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json
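
The reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-10-29 11:48:19.635833")
finished = datetime.fromisoformat("2024-10-29 11:48:21.128788")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```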

Variables

Name
Categorical

Distinct: 20
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value                Count
Larry Ellison        67
Amancio Ortega       65
Rob Walton           62
Elon Musk            58
Michael Bloomberg    57
Other values (15)    691

Length

Max length: 28
Median length: 15
Mean length: 12.776
Min length: 9

Characters and Unicode

Total characters: 12776
Distinct characters: 39
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
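
To illustrate what such character properties look like, the sketch below looks up the Unicode general category of each character in one sample value, using Python's standard `unicodedata` module. This is only an illustration and not necessarily how the report computes its figures. The standard library exposes the general category but has no script or block lookup, so only categories are shown.

```python
import unicodedata
from collections import Counter

# First sample row of the Name variable.
sample = "Rob Walton"

# General category per character, e.g. Lu = uppercase letter,
# Ll = lowercase letter, Zs = space separator.
categories = Counter(unicodedata.category(ch) for ch in sample)

for category, count in categories.most_common():
    print(category, count)
```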

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Rob Walton
2nd row: Sergey Brin
3rd row: Steve Ballmer
4th row: Mukesh Ambani
5th row: Jim Walton

Common Values

Value                Count    Frequency (%)
Larry Ellison        67       6.7%
Amancio Ortega       65       6.5%
Rob Walton           62       6.2%
Elon Musk            58       5.8%
Michael Bloomberg    57       5.7%
Alice Walton         54       5.4%
Bill Gates           54       5.4%
Mukesh Ambani        52       5.2%
Charles Koch         51       5.1%
Sergey Brin          51       5.1%
Other values (10)    429      42.9%
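
Each frequency in this table is the count divided by the 1000 observations, and the "Other values" row is whatever remains after the ten listed names. A minimal sketch of that arithmetic, with the counts copied from the table and illustrative variable names:

```python
# Top-10 counts copied from the Common Values table for Name.
top_counts = {
    "Larry Ellison": 67,
    "Amancio Ortega": 65,
    "Rob Walton": 62,
    "Elon Musk": 58,
    "Michael Bloomberg": 57,
    "Alice Walton": 54,
    "Bill Gates": 54,
    "Mukesh Ambani": 52,
    "Charles Koch": 51,
    "Sergey Brin": 51,
}
n_observations = 1000

# Frequency of each value as a percentage of all rows.
frequency_pct = {
    name: 100 * count / n_observations for name, count in top_counts.items()
}

# Rows not covered by the ten listed values.
other_count = n_observations - sum(top_counts.values())

print(f"Larry Ellison: {frequency_pct['Larry Ellison']:.1f}%")
print(f"Other values: {other_count} ({100 * other_count / n_observations:.1f}%)")
```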

Length

[Figure: Histogram of lengths of the category]

Words

Value                Count    Frequency (%)
walton               163      8.0%
larry                109      5.3%
koch                 93       4.5%
ellison              67       3.3%
ortega               65       3.2%
amancio              65       3.2%
rob                  62       3.0%
elon                 58       2.8%
musk                 58       2.8%
bloomberg            57       2.8%
Other values (27)    1248     61.0%
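
The word counts above are lowercased tokens, and their percentages are relative to the total number of words (the counts sum to 2045) rather than to the 1000 rows. The sketch below shows one way such counts could be produced. The tokenisation rule (lowercase, then split on whitespace) is an assumption, and the input is just the five sample rows listed for this variable.

```python
from collections import Counter

# The five sample rows shown for the Name variable.
names = ["Rob Walton", "Sergey Brin", "Steve Ballmer", "Mukesh Ambani", "Jim Walton"]

# Assumed tokenisation: lowercase each value, then split on whitespace.
word_counts = Counter(word for name in names for word in name.lower().split())
total_words = sum(word_counts.values())

# Frequency is relative to the total number of words, not rows.
for word, count in word_counts.most_common(3):
    print(word, count, f"{100 * count / total_words:.1f}%")
```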

Most occurring characters

Value                Count    Frequency (%)
e                    1164     9.1%
r                    1047     8.2%
(space)              1045     8.2%
a                    997      7.8%
l                    894      7.0%
o                    801      6.3%
n                    671      5.3%
t                    590      4.6%
i                    584      4.6%
s                    461      3.6%
Other values (29)    4522     35.4%

Most occurring categories

Value        Count    Frequency (%)
(unknown)    12776    100.0%

Most frequent character per category

(unknown): same counts as the "Most occurring characters" table, because all characters fall into this single category.

Most occurring scripts

Value        Count    Frequency (%)
(unknown)    12776    100.0%

Most frequent character per script

(unknown): same counts as the "Most occurring characters" table.

Most occurring blocks

Value        Count    Frequency (%)
(unknown)    12776    100.0%

Most frequent character per block

(unknown): same counts as the "Most occurring characters" table.

Country
Categorical

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value     Count
USA       756
France    92
Mexico    52
Spain     51
India     49

Length

Max length: 6
Median length: 3
Mean length: 3.632
Min length: 3

Characters and Unicode

Total characters: 3632
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Mexico
2nd row: USA
3rd row: USA
4th row: USA
5th row: USA

Common Values

Value     Count    Frequency (%)
USA       756      75.6%
France    92       9.2%
Mexico    52       5.2%
Spain     51       5.1%
India     49       4.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Words

Value     Count    Frequency (%)
usa       756      75.6%
france    92       9.2%
mexico    52       5.2%
spain     51       5.1%
india     49       4.9%

Most occurring characters

Value               Count    Frequency (%)
S                   807      22.2%
U                   756      20.8%
A                   756      20.8%
a                   192      5.3%
n                   192      5.3%
i                   152      4.2%
c                   144      4.0%
e                   144      4.0%
F                   92       2.5%
r                   92       2.5%
Other values (6)    305      8.4%

Most occurring categories

Value        Count    Frequency (%)
(unknown)    3632     100.0%

Most frequent character per category

(unknown): same counts as the "Most occurring characters" table, because all characters fall into this single category.

Most occurring scripts

Value        Count    Frequency (%)
(unknown)    3632     100.0%

Most frequent character per script

(unknown): same counts as the "Most occurring characters" table.

Most occurring blocks

Value        Count    Frequency (%)
(unknown)    3632     100.0%

Most frequent character per block

(unknown): same counts as the "Most occurring characters" table.

Industry
Categorical

Distinct: 10
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value               Count
Technology          350
Retail              217
Manufacturing       94
Media               57
Cosmetics           51
Other values (5)    231

Length

Max length: 18
Median length: 14
Mean length: 9.376
Min length: 5

Characters and Unicode

Total characters: 9376
Distinct characters: 25
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Finance
2nd row: Automotive
3rd row: Manufacturing
4th row: Technology
5th row: Fashion

Common Values

Value                 Count    Frequency (%)
Technology            350      35.0%
Retail                217      21.7%
Manufacturing         94       9.4%
Media                 57       5.7%
Cosmetics             51       5.1%
Telecommunications    51       5.1%
Finance               50       5.0%
Fashion               44       4.4%
Automotive            43       4.3%
Petrochemicals        43       4.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Words

Value                 Count    Frequency (%)
technology            350      35.0%
retail                217      21.7%
manufacturing         94       9.4%
media                 57       5.7%
cosmetics             51       5.1%
telecommunications    51       5.1%
finance               50       5.0%
fashion               44       4.4%
automotive            43       4.3%
petrochemicals        43       4.3%

Most occurring characters

Value                Count    Frequency (%)
o                    1026     10.9%
e                    956      10.2%
n                    784      8.4%
c                    733      7.8%
i                    701      7.5%
l                    661      7.0%
a                    650      6.9%
t                    542      5.8%
g                    444      4.7%
h                    437      4.7%
Other values (15)    2442     26.0%

Most occurring categories

Value        Count    Frequency (%)
(unknown)    9376     100.0%

Most frequent character per category

(unknown): same counts as the "Most occurring characters" table, because all characters fall into this single category.

Most occurring scripts

Value        Count    Frequency (%)
(unknown)    9376     100.0%

Most frequent character per script

(unknown): same counts as the "Most occurring characters" table.

Most occurring blocks

Value        Count    Frequency (%)
(unknown)    9376     100.0%

Most frequent character per block

(unknown): same counts as the "Most occurring characters" table.

Net Worth (in billions)
Real number (ℝ)

Distinct: 982
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 102.61627
Minimum: 1.57
Maximum: 199.24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 1.57
5th percentile: 12.3585
Q1: 54.96
Median: 103.365
Q3: 151.9125
95th percentile: 189.5755
Maximum: 199.24
Range: 197.67
Interquartile range (IQR): 96.9525
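
The range and interquartile range follow directly from the other quantiles: range = maximum - minimum and IQR = Q3 - Q1. A short check with the reported values (variable names are illustrative):

```python
# Quantiles copied from the report.
minimum, maximum = 1.57, 199.24
q1, q3 = 54.96, 151.9125

# Range spans the full data; IQR spans the middle 50%.
value_range = maximum - minimum
iqr = q3 - q1

print(f"Range: {value_range:.2f}")
print(f"IQR: {iqr:.4f}")
```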

Descriptive statistics

Standard deviation: 56.796062
Coefficient of variation (CV): 0.55348008
Kurtosis: -1.1691633
Mean: 102.61627
Median Absolute Deviation (MAD): 48.43
Skewness: -0.0083658018
Sum: 102616.27
Variance: 3225.7926
Monotonicity: Not monotonic
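
Several of these statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks the reported values against those identities. Small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the report.
mean = 102.61627
std_dev = 56.796062
n_observations = 1000

cv = std_dev / mean            # coefficient of variation
variance = std_dev ** 2        # variance is the squared standard deviation
total = mean * n_observations  # sum of all values

print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.3f}")
print(f"Sum: {total:.2f}")
```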
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value                 Count    Frequency (%)
116.42                3        0.3%
98.27                 2        0.2%
192.11                2        0.2%
185.43                2        0.2%
65.74                 2        0.2%
40.2                  2        0.2%
105.84                2        0.2%
78.03                 2        0.2%
179.62                2        0.2%
167.65                2        0.2%
Other values (972)    979      97.9%

Minimum 10 values

Value     Count    Frequency (%)
1.57      1        0.1%
1.86      1        0.1%
2.07      1        0.1%
2.41      1        0.1%
2.49      1        0.1%
2.77      1        0.1%
2.92      1        0.1%
2.99      1        0.1%
3.18      1        0.1%
3.21      1        0.1%

Maximum 10 values

Value     Count    Frequency (%)
199.24    1        0.1%
199.21    1        0.1%
199.2     1        0.1%
199.1     1        0.1%
199       1        0.1%
198.77    1        0.1%
198.34    1        0.1%
198.05    2        0.2%
197.6     1        0.1%
197.41    1        0.1%

Company
Categorical

Distinct: 15
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value                Count
Walmart              160
Google               101
Microsoft            101
Koch Industries      99
LVMH                 57
Other values (10)    482

Length

Max length: 19
Median length: 15
Mean length: 9.021
Min length: 4

Characters and Unicode

Total characters: 9021
Distinct characters: 41
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Walmart
2nd row: Google
3rd row: Koch Industries
4th row: Google
5th row: Walmart

Common Values

Value                  Count    Frequency (%)
Walmart                160      16.0%
Google                 101      10.1%
Microsoft              101      10.1%
Koch Industries        99       9.9%
LVMH                   57       5.7%
Reliance Industries    56       5.6%
L'Oreal                54       5.4%
Zara                   51       5.1%
Grupo Carso            49       4.9%
Facebook               49       4.9%
Other values (5)       223      22.3%

Length

[Figure: Histogram of lengths of the category]

Words

Value               Count    Frequency (%)
walmart             160      12.4%
industries          155      12.0%
microsoft           101      7.8%
google              101      7.8%
koch                99       7.7%
lvmh                57       4.4%
reliance            56       4.3%
l'oreal             54       4.2%
zara                51       4.0%
facebook            49       3.8%
Other values (9)    408      31.6%

Most occurring characters

Value                Count    Frequency (%)
a                    907      10.1%
o                    822      9.1%
r                    800      8.9%
e                    698      7.7%
s                    553      6.1%
l                    504      5.6%
t                    463      5.1%
i                    359      4.0%
c                    352      3.9%
(space)              291      3.2%
Other values (31)    3272     36.3%

Most occurring categories

Value        Count    Frequency (%)
(unknown)    9021     100.0%

Most frequent character per category

(unknown): same counts as the "Most occurring characters" table, because all characters fall into this single category.

Most occurring scripts

Value        Count    Frequency (%)
(unknown)    9021     100.0%

Most frequent character per script

(unknown): same counts as the "Most occurring characters" table.

Most occurring blocks

Value        Count    Frequency (%)
(unknown)    9021     100.0%

Most frequent character per block

(unknown): same counts as the "Most occurring characters" table.

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

                           Company    Country    Industry    Name     Net Worth (in billions)
Company                    1.000      0.044      0.000       0.000    0.039
Country                    0.044      1.000      0.024       0.000    0.000
Industry                   0.000      0.024      1.000       0.000    0.027
Name                       0.000      0.000      0.000       1.000    0.000
Net Worth (in billions)    0.039      0.000      0.027       0.000    1.000
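
The report does not state which association measure produced this matrix. As an illustration only, the sketch below computes the uncorrected Cramér's V, one common measure of association between two categorical variables, which ranges from 0 (no association) to 1 (perfect association). The function name and the toy data are hypothetical and are not taken from the report.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected

    # Normalise by sample size and the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical toy data: perfectly associated vs. independent pairs.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

Values close to 0, like those in the matrix above, indicate that the paired variables are close to independent under whichever measure was used.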

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

       Name                 Country    Industry         Net Worth (in billions)    Company
0      Rob Walton           Mexico     Finance          8.50                       Walmart
1      Sergey Brin          USA        Automotive       44.76                      Google
2      Steve Ballmer        USA        Manufacturing    13.43                      Koch Industries
3      Mukesh Ambani        USA        Technology       120.44                     Google
4      Jim Walton           USA        Fashion          122.39                     Walmart
5      Sergey Brin          USA        Technology       93.19                      Walmart
6      Michael Bloomberg    USA        Cosmetics        117.96                     Reliance Industries
7      Warren Buffett       France     Retail           36.62                      Microsoft
8      Carlos Slim          USA        Technology       97.35                      Reliance Industries
9      Larry Page           USA        Technology       88.05                      Walmart

Last rows

       Name                 Country    Industry         Net Worth (in billions)    Company
990    Charles Koch         USA        Retail           93.70                      Walmart
991    Jim Walton           USA        Fashion          9.18                       L'Oreal
992    Charles Koch         India      Retail           19.53                      L'Oreal
993    Larry Ellison        Mexico     Automotive       75.21                      Tesla
994    Mark Zuckerberg      Mexico     Finance          87.07                      Reliance Industries
995    Warren Buffett       USA        Retail           142.66                     Facebook
996    Amancio Ortega       USA        Media            166.87                     Walmart
997    Alice Walton         USA        Retail           30.44                      Walmart
998    Amancio Ortega       Spain      Retail           163.18                     Reliance Industries
999    Jim Walton           USA        Retail           186.94                     Berkshire Hathaway