3-3-2 グループの中での順序を扱う
conn = psycopg2.connect("dbname=BigData host=localhost user=testuser")
df = pd.read_sql("SELECT * FROM popular_products", conn)
print(df)
df['row'] = df['score'].rank(
ascending=False,
method='first'
).astype(int)
df['rank'] = df['score'].rank(
ascending=False,
method='min'
).astype(int)
df['dense_rank'] = df['score'].rank(
ascending=False,
method='dense'
).astype(int)
df['lag1'] = df['product_id'].shift(1)
df['lag2'] = df['product_id'].shift(2)
df['lead1'] = df['product_id'].shift(-1)
df['lead2'] = df['product_id'].shift(-2)
print(df)
ORDER BY句と集約関数を組み合わせる
conn = psycopg2.connect("dbname=BigData host=localhost user=testuser")
df = pd.read_sql("SELECT * FROM popular_products", conn)
print(df)
df['row'] = df['score'].rank(
ascending=False,
method='first'
).astype(int)
df['cum_score'] = df['score'].cumsum()
df['loacl_avg'] =
df['score'].rolling(window=3,center=True,min_periods=1).mean()
df['first_value'] = df['product_id'][df['score'].idxmax()]
df['last_value'] = df['product_id'][df['score'].idxmin()]
print(df)
PATITION BY とORDER BYを組み合わせる
いろいろ試してみたができなかった
3-3-3 縦持ちデータを横もちにする
import pandas as pd
import psycopg2
import numpy as np
conn = psycopg2.connect("dbname=BigData host=localhost user=testuser")
df = pd.read_sql("SELECT * FROM daily_kpi", conn)
print(df)
pd.pivot_table(df,index='dt',columns='indicator')
3-3-4 横もちを縦持ちにする
conn = psycopg2.connect("dbname=BigData host=localhost user=testuser")
df = pd.read_sql("SELECT * FROM quarterly_sales", conn)
print(df)
df.stack()