「PythonとJavaScriptではじめるデータビジュアライゼーション」を読む

jupyter notebookから直接MarkDownファイルに変換してここに貼り付ける方式に変更した

8.4DataFrame

import pandas as pd 
PATH = "/home/beetle/myproject/DataVisualization/nobel_winners/dataviz-with-python-and-js/Ch06_Heavyweight_Scraping_with_Scrapy/winner_list.json"
df = pd.read_json(PATH)
df.head(3)
born_in category country date_of_birth date_of_death gender link name place_of_birth place_of_death text year
0 Economics Austria 8 May 1899 23 March 1992 male http://en.wikipedia.org/wiki/Friedrich_Hayek Friedrich Hayek Vienna Freiburg im Breisgau Friedrich Hayek , Economics, 1974 1974
1 Physiology or Medicine Austria 7 November 1903 27 February 1989 male http://en.wikipedia.org/wiki/Konrad_Lorenz Konrad Lorenz Vienna Vienna Konrad Lorenz , Physiology or Medicine, 1973 1973
2 Austria Physiology or Medicine 20 November 1886 12 June 1982 male http://en.wikipedia.org/wiki/Karl_von_Frisch Karl von Frisch * Vienna Munich Karl von Frisch *, Physiology or Medicine, 1973 1973
8.4.1インデックス
print(df.columns)
print(df.index)
df =df.set_index('name')
print(df.loc['Albert Einstein'])
df = df.reset_index()
Index(['born_in', 'category', 'country', 'date_of_birth', 'date_of_death',
       'gender', 'link', 'name', 'place_of_birth', 'place_of_death', 'text',
       'year'],
      dtype='object')
Int64Index([   0,    1,    2,    3,    4,    5,    6,    7,    8,    9,
            ...
            1048, 1049, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057],
           dtype='int64', length=1058)
                born_in category      country  date_of_birth  date_of_death  \
name                                                                          
Albert Einstein          Physics  Switzerland  14 March 1879  18 April 1955   
Albert Einstein          Physics      Germany  14 March 1879  18 April 1955   

                gender                                          link  \
name                                                                   
Albert Einstein   male  http://en.wikipedia.org/wiki/Albert_Einstein   
Albert Einstein   male  http://en.wikipedia.org/wiki/Albert_Einstein   

                place_of_birth place_of_death  \
name                                            
Albert Einstein            Ulm      Princeton   
Albert Einstein            Ulm      Princeton   

                                                              text  year  
name                                                                      
Albert Einstein  Albert Einstein ,  born in Germany , Physics, ...  1921  
Albert Einstein                    Albert Einstein , Physics, 1921  1921  
8.4.2 行と列
print(df.iloc[2])
print(df.ix[2])
df = df.set_index('name')
print(df.ix['Albert Einstein'])
print(df.ix[2])
name                                            Karl von Frisch *
born_in                                                   Austria
category                                   Physiology or Medicine
country                                                          
date_of_birth                                    20 November 1886
date_of_death                                        12 June 1982
gender                                                       male
link                 http://en.wikipedia.org/wiki/Karl_von_Frisch
place_of_birth                                             Vienna
place_of_death                                             Munich
text              Karl von Frisch *, Physiology or Medicine, 1973
year                                                         1973
Name: 2, dtype: object
name                                            Karl von Frisch *
born_in                                                   Austria
category                                   Physiology or Medicine
country                                                          
date_of_birth                                    20 November 1886
date_of_death                                        12 June 1982
gender                                                       male
link                 http://en.wikipedia.org/wiki/Karl_von_Frisch
place_of_birth                                             Vienna
place_of_death                                             Munich
text              Karl von Frisch *, Physiology or Medicine, 1973
year                                                         1973
Name: 2, dtype: object
                born_in category      country  date_of_birth  date_of_death  \
name                                                                          
Albert Einstein          Physics  Switzerland  14 March 1879  18 April 1955   
Albert Einstein          Physics      Germany  14 March 1879  18 April 1955   

                gender                                          link  \
name                                                                   
Albert Einstein   male  http://en.wikipedia.org/wiki/Albert_Einstein   
Albert Einstein   male  http://en.wikipedia.org/wiki/Albert_Einstein   

                place_of_birth place_of_death  \
name                                            
Albert Einstein            Ulm      Princeton   
Albert Einstein            Ulm      Princeton   

                                                              text  year  
name                                                                      
Albert Einstein  Albert Einstein ,  born in Germany , Physics, ...  1921  
Albert Einstein                    Albert Einstein , Physics, 1921  1921  
born_in                                                   Austria
category                                   Physiology or Medicine
country                                                          
date_of_birth                                    20 November 1886
date_of_death                                        12 June 1982
gender                                                       male
link                 http://en.wikipedia.org/wiki/Karl_von_Frisch
place_of_birth                                             Vienna
place_of_death                                             Munich
text              Karl von Frisch *, Physiology or Medicine, 1973
year                                                         1973
Name: Karl von Frisch *, dtype: object


/home/beetle/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:2: DeprecationWarning: 
.ix is deprecated. Please use
.loc for label based indexing or
.iloc for positional indexing

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate_ix

/home/beetle/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:4: DeprecationWarning: 
.ix is deprecated. Please use
.loc for label based indexing or
.iloc for positional indexing

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate_ix
  after removing the cwd from sys.path.
8.4.3グループ
df = df.groupby('category')
print(df.groups.keys())
#物理学受賞者のみピックアップ
phy_group = df.get_group('Physics')
print(phy_group.head(3))
dict_keys(['', 'Chemistry', 'Economics', 'Literature', 'Peace', 'Physics', 'Physiology or Medicine'])
                  born_in category        country     date_of_birth  \
name                                                                  
Brian Schmidt              Physics      Australia  24 February 1967   
Percy W. Bridgman          Physics  United States     21 April 1882   
Isidor Isaac Rabi          Physics  United States      29 July 1898   

                     date_of_death gender  \
name                                        
Brian Schmidt                  NaN   male   
Percy W. Bridgman   20 August 1961   male   
Isidor Isaac Rabi  11 January 1988   male   

                                                             link  \
name                                                                
Brian Schmidt          http://en.wikipedia.org/wiki/Brian_Schmidt   
Percy W. Bridgman  http://en.wikipedia.org/wiki/Percy_W._Bridgman   
Isidor Isaac Rabi  http://en.wikipedia.org/wiki/Isidor_Isaac_Rabi   

                  place_of_birth place_of_death  \
name                                              
Brian Schmidt           Missoula            NaN   
Percy W. Bridgman      Cambridge       Randolph   
Isidor Isaac Rabi        Rymanów  New York City   

                                                                text  year  
name                                                                        
Brian Schmidt      Brian Schmidt ,  born in the United States , P...  2011  
Percy W. Bridgman                  Percy W. Bridgman , Physics, 1946  1946  
Isidor Isaac Rabi  Isidor Isaac Rabi ,  born in Austria , Physics...  1944