
All Categories (153)

[Pandas] Pandas Cheat Sheet (Handling Missing Data)

Handling Missing Data

In [1]: import pandas as pd
        import numpy as np

In [2]: df = pd.DataFrame([[np.nan, 3, np.nan, 0],
                           [5, 2, 3, 1],
                           [np.nan, np.nan, np.nan, 8],
                           [4, np.nan, 6, 8]],
                          columns=list('ABCD'))
        df
Out[2]:
     A    B    C  D
0  NaN  3.0  NaN  0
1  5.0  2.0  3.0  1
2  NaN  NaN  NaN  8
3  4.0  NaN  6.0  8

isna(), notna(): check whether values are missing

In [3]: df.isna().sum()
Out[3]:
A    2
B    2
C    2
D    0
dtype: int64

For missing values, isna() ..

2021. 6. 21.

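The excerpt above only shows isna(). As a minimal sketch on the same DataFrame, the block below also uses notna() and counts per row instead of per column. The variable names (missing_per_column, present_per_column, rows_with_missing) are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Same frame as in the post excerpt.
df = pd.DataFrame(
    [[np.nan, 3, np.nan, 0],
     [5, 2, 3, 1],
     [np.nan, np.nan, np.nan, 8],
     [4, np.nan, 6, 8]],
    columns=list("ABCD"),
)

# isna() marks missing cells as True; summing counts them per column.
missing_per_column = df.isna().sum()

# notna() is the element-wise opposite: True where a value is present.
present_per_column = df.notna().sum()

# any(axis=1) flags rows that contain at least one missing value.
rows_with_missing = df.isna().any(axis=1)

print(missing_per_column)
print(present_per_column)
print(rows_with_missing)
```

For every column, the isna() and notna() counts should add up to the number of rows, which is a quick sanity check.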
[Pandas] Pandas Cheat Sheet (Summarize Data)

Summarize Data

In [1]: import pandas as pd
        import seaborn as sns
        import numpy as np

In [2]: df = sns.load_dataset('iris')
        df.head()
Out[2]:
   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

value_counts

In [3]: df['species'].value_counts()
Out[3]: ..

2021. 6. 20.

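The value_counts output is cut off in the excerpt, so here is a rough sketch of how the method behaves. It uses a small made-up Series rather than the iris dataset, so that it runs without downloading anything. The data and the names species, counts, and shares are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the iris 'species' column (not the real data).
species = pd.Series(
    ["setosa", "virginica", "setosa", "versicolor", "setosa", "virginica"],
    name="species",
)

# value_counts() counts each distinct value, most frequent first.
counts = species.value_counts()

# normalize=True returns relative frequencies instead of raw counts.
shares = species.value_counts(normalize=True)

print(counts)
print(shares)
```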
[Pandas] Pandas Cheat Sheet (Subset Variables (Columns))

Subset Variables (Columns)

In [1]: import pandas as pd
        import seaborn as sns

In [2]: df = sns.load_dataset("iris")
        df.head()
Out[2]:
   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

Select several columns by name.

In [3]: df[['sepal_length','sepal_width',..

2021. 6. 20.

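The column-selection call is truncated in the excerpt. As a sketch of the general idea, the block below selects columns from a two-row frame that copies the first two rows of the df.head() output above. Which columns the original post selected is not known. The filter(regex=...) call is an extra illustration that may not appear in the post.

```python
import pandas as pd

# Two rows copied from the df.head() output shown in the excerpt.
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9],
    "sepal_width": [3.5, 3.0],
    "petal_length": [1.4, 1.4],
    "petal_width": [0.2, 0.2],
    "species": ["setosa", "setosa"],
})

# A list of names inside [] returns a DataFrame with just those columns.
sepal = df[["sepal_length", "sepal_width"]]

# filter(regex=...) selects columns whose names match a pattern.
lengths = df.filter(regex="_length$")

print(sepal)
print(lengths)
```

Note that a single name in single brackets (df["species"]) returns a Series, while a list of names returns a DataFrame, even if the list has only one element.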
[Pandas] Pandas Cheat Sheet (Subset Observations (Rows))

Subset Observations (Rows)

In [1]: import pandas as pd

In [2]: df = pd.DataFrame(
            {"a" : [4, 5, 6, 4, 2, 4, 5, 7],
             "b" : [7, 8, 9, 7, 10, 3, 2, 1],
             "c" : [10, 11, 12, 10, 11, 14, 3, 10]})

In [3]: df
Out[3]:
   a   b   c
0  4   7  10
1  5   8  11
2  6   9  12
3  4   7  10
4  2  10  11
5  4   3  14
6  5   2   3
7  7   1  10

Select the rows where column a is greater than 5.

In [4]: df[df.a > 5]
Out[4]:
   a  b   c
2  6  9  12
7  7  1  10

Select the rows where column a is not 7.

In [5]: df[df["a"] !..

2021. 5. 23.
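The row filters above rely on boolean masks. The block below is a minimal sketch on the same DataFrame. The "not 7" filter is written out from its heading, since the excerpt cuts off mid-expression, so it is an assumption rather than the post's exact code. The combined condition and the variable names are illustrative additions.

```python
import pandas as pd

# Same frame as in the post excerpt.
df = pd.DataFrame({
    "a": [4, 5, 6, 4, 2, 4, 5, 7],
    "b": [7, 8, 9, 7, 10, 3, 2, 1],
    "c": [10, 11, 12, 10, 11, 14, 3, 10],
})

# A comparison on a column yields a boolean Series; indexing with it keeps the True rows.
over_five = df[df["a"] > 5]

# Assumed form of the truncated filter: rows where a is not 7.
not_seven = df[df["a"] != 7]

# Conditions combine with & (and) or | (or); each side needs its own parentheses.
both = df[(df["a"] > 5) & (df["b"] < 5)]

print(over_five)
print(not_seven)
print(both)
```

The parentheses around each condition are needed because & and | bind more tightly than comparison operators in Python.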