Reading "Data Visualization with Python and JavaScript"

import pandas as pd
import numpy as np
np.random.seed(0)

9.2 Inspecting the Data

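def reload_data(name='nobel_winners_dirty.json'):
    # Load the still-dirty Nobel winners dataset from the data/ directory.
    # The with-block closes the file handle once the JSON has been read.
    with open('data/' + name) as f:
        df = pd.read_json(f)
    return df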
df = reload_data()
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1052 entries, 0 to 1051
Data columns (total 12 columns):
born_in           1052 non-null object
category          1052 non-null object
country           1052 non-null object
date_of_birth     1044 non-null object
date_of_death     1044 non-null object
gender            1040 non-null object
link              1052 non-null object
name              1052 non-null object
place_of_birth    1044 non-null object
place_of_death    1044 non-null object
text              1052 non-null object
year              1052 non-null int64
dtypes: int64(1), object(11)
memory usage: 106.8+ KB
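
The info() output above shows several columns with fewer than 1052 non-null entries (gender has 1040, for example). As a rough sketch, the missing values can also be counted per column directly. The `sample` DataFrame below is made-up illustration data, not the book's dataset.

```python
import pandas as pd

# Hypothetical sample that mimics a few columns of the Nobel data.
sample = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'gender': ['male', None, 'female', None],
    'date_of_birth': ['8 October 1927', None, '9 October 1892', '26 July 1829'],
})

# isnull() flags missing cells; sum() then counts them per column.
missing_counts = sample.isnull().sum()
print(missing_counts)

# Select the rows whose gender is missing so they can be inspected by name.
missing_gender = sample[sample['gender'].isnull()]
print(missing_gender['name'].tolist())
```

On the real DataFrame the same two calls would be `df.isnull().sum()` and `df[df['gender'].isnull()]`.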
df.describe()

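describe() returns summary statistics. By default it covers only the numeric columns, which is why only year appears here.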

year
count 1052.000000
mean 1968.729087
std 33.155829
min 1809.000000
25% 1947.000000
50% 1975.000000
75% 1996.000000
max 2014.000000
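
The minimum year of 1809 looks suspicious, since the first Nobel Prizes were awarded in 1901. A minimal sketch of how such rows could be pulled out for inspection follows. The `sample` data and the `FIRST_PRIZE_YEAR` constant are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample; only the year column matters for this check.
sample = pd.DataFrame({
    'name': ['A', 'B', 'C'],
    'year': [1809, 1904, 1984],
})

# The first Nobel Prizes were awarded in 1901, so earlier years are suspect.
FIRST_PRIZE_YEAR = 1901

# A boolean mask keeps only the rows with an implausibly early year.
suspect = sample[sample['year'] < FIRST_PRIZE_YEAR]
print(suspect)
```

Applied to the real data, `df[df['year'] < 1901]` would list the rows to check against their Wikipedia links.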
df.describe(include=['object'])
 | born_in | category | country | date_of_birth | date_of_death | gender | link | name | place_of_birth | place_of_death | text
count | 1052 | 1052 | 1052 | 1044 | 1044 | 1040 | 1052 | 1052 | 1044 | 1044 | 1052
unique | 40 | 7 | 59 | 853 | 563 | 2 | 893 | 998 | 735 | 410 | 1043
top | | Physiology or Medicine | United States | 7 November 1867 | | male | http://en.wikipedia.org/wiki/Michael_Levitt | Felix Bloch | | | Henry Dunant , Peace, 1901
freq | 910 | 250 | 350 | 4 | 362 | 982 | 4 | 2 | 29 | 409 | 2
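
The top row shows fewer visible values than there are columns, which hints that for some columns (born_in, for example, with a freq of 910) the most frequent value is an empty string. If that is the case, those cells count as non-null in info(). One possible way to handle this, sketched on made-up data, is to convert empty strings to NaN so that pandas treats them as missing.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: born_in is mostly an empty string, as the freq row suggests.
sample = pd.DataFrame({
    'name': ['A', 'B', 'C'],
    'born_in': ['', 'Bosnia and Herzegovina', ''],
})

# count() ignores NaN but not empty strings, so all three rows are counted.
before = sample['born_in'].count()

# Replace cells that are exactly '' with NaN so they register as missing.
cleaned = sample.replace('', np.nan)
after = cleaned['born_in'].count()

print(before, after)
```

Cells that contain only whitespace would not be matched by this replacement and would need separate handling.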
df.head()
 | born_in | category | country | date_of_birth | date_of_death | gender | link | name | place_of_birth | place_of_death | text | year
0 | | Physiology or Medicine | Argentina | 8 October 1927 | 24 March 2002 | male | http://en.wikipedia.org/wiki/C%C3%A9sar_Milstein | César Milstein | Bahía Blanca , Argentina | Cambridge , England | César Milstein , Physiology or Medicine, 1984 | 1984
1 | Bosnia and Herzegovina | Literature | | 9 October 1892 | 13 March 1975 | male | http://en.wikipedia.org/wiki/Ivo_Andric | Ivo Andric * | Dolac (village near Travnik), Austria-Hungary ... | Belgrade, SR Serbia, SFR Yugoslavia (present-d... | Ivo Andric *, born in then Austria–Hungary ,... | 1961
2 | Bosnia and Herzegovina | Chemistry | | July 23, 1906 | 1998-01-07 | male | http://en.wikipedia.org/wiki/Vladimir_Prelog | Vladimir Prelog * | Sarajevo , Bosnia and Herzegovina , then part... | Zürich , Switzerland | Vladimir Prelog *, born in then Austria–Hung... | 1975
3 | | Peace | Belgium | None | None | None | http://en.wikipedia.org/wiki/Institut_de_Droit... | Institut de Droit International | None | None | Institut de Droit International , Peace, 1904 | 1904
4 | | Peace | Belgium | 26 July 1829 | 6 October 1912 | male | http://en.wikipedia.org/wiki/Auguste_Marie_Fra... | Auguste Beernaert | Ostend , Netherlands (now Belgium ) | Lucerne , Switzerland | Auguste Beernaert , Peace, 1909 | 1909
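
The first five rows already show dates in at least three formats ("8 October 1927", "July 23, 1906", "1998-01-07") as well as missing values. A minimal sketch of normalizing them, assuming that parsing each value on its own is acceptable for a dataset of this size, is shown below. The `parse_date` helper is a hypothetical name introduced here, not something from the book.

```python
import pandas as pd

# Date strings copied from the head() output, plus one missing value.
raw_dates = ['8 October 1927', 'July 23, 1906', '1998-01-07', None]

def parse_date(value):
    """Parse a single date string; return NaT if it is missing or unparseable."""
    if value is None:
        return pd.NaT
    # errors='coerce' turns strings that cannot be parsed into NaT.
    return pd.to_datetime(value, errors='coerce')

# Parsing value by value avoids assuming one shared format for the whole column.
parsed = pd.Series([parse_date(v) for v in raw_dates])
print(parsed)
```

Parsing one value at a time is slower than a single vectorized call, but it sidesteps the question of which format the column as a whole should be read with.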