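Checking the dtypes of DataFrame columns and selecting only the columns of a given dtype
Method
- pandas.DataFrame.select_dtypes
- Returns the subset of a DataFrame's columns that match the specified dtypes
Parameters
- include: the dtypes to select, given as a list of dtype strings (a single value is also accepted)
- exclude: the dtypes to leave out, given as a list of dtype strings (a single value is also accepted)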
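As a minimal sketch of how the two parameters combine, the snippet below uses a small hypothetical frame (`df_demo` and its column names are illustrative and not part of the dataset used in this post). It keeps the numeric columns while dropping the int64 ones, and it also shows that calling select_dtypes with neither parameter raises a ValueError.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration only
df_demo = pd.DataFrame({
    "count": np.array([1, 2, 3], dtype="int64"),
    "ratio": [0.1, 0.2, 0.3],
    "label": ["a", "b", "c"],
})

# include and exclude can be used together:
# keep numeric columns, then drop the int64 ones
floats_only = df_demo.select_dtypes(include=["number"], exclude=["int64"])
print(list(floats_only.columns))  # ['ratio']

# At least one of include / exclude must be given
try:
    df_demo.select_dtypes()
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

The result is always a new DataFrame, so the original frame is left unchanged.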
Notes
- To select all numeric types: np.number or 'number'
- To select datetimes: np.datetime64, 'datetime', or 'datetime64'
- To select timedeltas: np.timedelta64, 'timedelta', or 'timedelta64'
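The datetime and timedelta specifiers above can be tried on a small frame. This is only a sketch: `df_time` and its columns are hypothetical and do not appear in the dataset used below.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one datetime column and one timedelta column
df_time = pd.DataFrame({
    "when": pd.to_datetime(["2020-01-01", "2020-01-02"]),
    "elapsed": pd.to_timedelta(["1 days", "2 days"]),
    "name": ["a", "b"],
})

# String specifiers
datetime_cols = df_time.select_dtypes(include=["datetime"]).columns.tolist()
timedelta_cols = df_time.select_dtypes(include=["timedelta"]).columns.tolist()
print(datetime_cols)   # ['when']
print(timedelta_cols)  # ['elapsed']

# The NumPy type selects the same datetime column
datetime_cols_np = df_time.select_dtypes(include=[np.datetime64]).columns.tolist()
print(datetime_cols_np)  # ['when']
```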
Preparing the data

    import pandas as pd
    import numpy as np

    # Build a small sample dataset that contains some missing values
    data = {'PassengerId': [np.nan, 2, 3, 4, 5],
            'Survived': [0, 1, 1, 1, 0],
            'Pclass': [3, 1, np.nan, 1, 3],
            'Sex': ['male', 'female', np.nan, 'female', 'male'],
            'Age': [22.0, 38.0, np.nan, 35.0, 35.0],
            'Embarked': ['S', np.nan, np.nan, 'S', 'S']}
    df = pd.DataFrame(data)
    df.head()

| | PassengerId | Survived | Pclass | Sex | Age | Embarked |
|---|---|---|---|---|---|---|
| 0 | NaN | 0 | 3.0 | male | 22.0 | S |
| 1 | 2.0 | 1 | 1.0 | female | 38.0 | NaN |
| 2 | 3.0 | 1 | NaN | NaN | NaN | NaN |
| 3 | 4.0 | 1 | 1.0 | female | 35.0 | S |
| 4 | 5.0 | 0 | 3.0 | male | 35.0 | S |
pandas.DataFrame.select_dtypes
Check the dtype of each column with info

    df.info()

PassengerId 4 non-null float64
Survived 5 non-null int64
Pclass 4 non-null float64
Sex 4 non-null object
Age 4 non-null float64
Embarked 3 non-null object
dtypes: float64(3), int64(1), object(2)
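Besides info, the dtypes attribute returns the same dtype information as a Series, which is convenient when the result needs to be used in code rather than just read. The following sketch rebuilds the sample data from this post; the variable name `float_cols` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Same sample data as in this post
data = {'PassengerId': [np.nan, 2, 3, 4, 5],
        'Survived': [0, 1, 1, 1, 0],
        'Pclass': [3, 1, np.nan, 1, 3],
        'Sex': ['male', 'female', np.nan, 'female', 'male'],
        'Age': [22.0, 38.0, np.nan, 35.0, 35.0],
        'Embarked': ['S', np.nan, np.nan, 'S', 'S']}
df = pd.DataFrame(data)

# dtypes is a Series indexed by column name
print(df.dtypes)

# Names of the float64 columns, taken directly from dtypes
float_cols = df.dtypes[df.dtypes == "float64"].index.tolist()
print(float_cols)  # ['PassengerId', 'Pclass', 'Age']
```

PassengerId and Pclass show up as float64 because an integer column that contains NaN is stored as floating point by default.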
Select the columns whose dtype is object

    df.select_dtypes(include=['object'])

| | Sex | Embarked |
|---|---|---|
| 0 | male | S |
| 1 | female | NaN |
| 2 | NaN | NaN |
| 3 | female | S |
| 4 | male | S |
Select all columns except those whose dtype is float64

    df.select_dtypes(exclude=['float64'])

| | Survived | Sex | Embarked |
|---|---|---|---|
| 0 | 0 | male | S |
| 1 | 1 | female | NaN |
| 2 | 1 | NaN | NaN |
| 3 | 1 | female | S |
| 4 | 0 | male | S |
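Excluding float64 still leaves the int64 column Survived in the result. When the goal is to split numeric from non-numeric columns, the 'number' specifier may be a better fit. A minimal sketch on the same sample data (the variable names `numeric_cols` and `non_numeric_cols` are introduced here for illustration):

```python
import numpy as np
import pandas as pd

# Same sample data as in this post
data = {'PassengerId': [np.nan, 2, 3, 4, 5],
        'Survived': [0, 1, 1, 1, 0],
        'Pclass': [3, 1, np.nan, 1, 3],
        'Sex': ['male', 'female', np.nan, 'female', 'male'],
        'Age': [22.0, 38.0, np.nan, 35.0, 35.0],
        'Embarked': ['S', np.nan, np.nan, 'S', 'S']}
df = pd.DataFrame(data)

# 'number' matches both the int64 and the float64 columns
numeric_cols = df.select_dtypes(include=['number']).columns.tolist()
print(numeric_cols)  # ['PassengerId', 'Survived', 'Pclass', 'Age']

# Excluding 'number' leaves only the non-numeric columns
non_numeric_cols = df.select_dtypes(exclude=['number']).columns.tolist()
print(non_numeric_cols)  # ['Sex', 'Embarked']
```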