A question: how do I filter rows in a pandas DataFrame?



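Requirement

I have a DataFrame called df1. How do I filter its rows?
For example, select the rows from 2018. Or, for something a bit more complex, the rows from 2018 with a price below 2500.
The first 21 rows (index 0 to 20) are shown below as an example.

Data: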

>>> df1
         pd_date price
0        2018 年 3 月  1499
1        2018 年 5 月  1398
2     2018 年 3 月 20 日   999
3     2018 年 8 月 29 日  3499
4     2017 年 9 月 30 日  2598
5     2017 年 6 月 16 日  1859
6     2018 年 3 月 28 日  2998
7    2017 年 12 月 29 日  1199
8        2018 年 7 月  3699
9          2017 年  2299
10       2018 年 9 月  2399
11   2017 年 12 月 26 日   880
12   2016 年 11 月 14 日  1788
13     2017 年 9 月 1 日   799
14     2018 年 7 月 7 日  1898
15     2018 年 9 月 1 日  2889
16       2018 年 5 月  3499
17      2016 年 11 月  1099
18       2018 年 5 月  3199
19      2017 年 12 月  1499
20       2017 年 9 月  5199


4 replies • 2018-10-21 09:18:19 +08:00

wqzjk393

2018-10-18 15:58:25 +08:00

df.where
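
To unpack the one-word suggestion above, here is a minimal sketch of how `DataFrame.where` compares with plain boolean indexing. The three-row `df` is a made-up sample in the same shape as the question's data, and the names `mask`, `masked` and `filtered` are illustrative only.

```python
import pandas as pd

# Hypothetical sample in the same shape as df1 from the question
df = pd.DataFrame({
    "pd_date": ["2018 年 3 月", "2017 年 9 月 30 日", "2018 年 8 月 29 日"],
    "price": [1499, 2598, 3499],
})

# Row-wise condition: date string starts with "2018" and price is below 2500
mask = df["pd_date"].str[:4].eq("2018") & df["price"].lt(2500)

# where() keeps the original shape; rows that fail the condition become NaN
masked = df.where(mask)

# Boolean indexing drops the rows that fail the condition
filtered = df[mask]
```

Because `where` keeps every row and fills the non-matching ones with NaN, boolean indexing is usually the more direct fit when the goal is to select rows.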

Goooa

2018-10-18 16:05:15 +08:00

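import pandas as pd

# Build a helper column holding the 4-character year prefix of each date string
year_list = []
for date in df1['pd_date'].tolist():
    year_list.append(date[:4])
# Reuse df1's index so the new column lines up even if the index is not 0..n-1
df1['year'] = pd.Series(year_list, index=df1.index)
# Rows from 2018 (the year is stored as a string)
rows_2018 = df1[df1['year'] == '2018']
# Rows from 2018 with a price below 2500
rows_2018_under_2500 = df1[(df1['year'] == '2018') & (df1['price'] < 2500)]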
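
The same helper-column idea can also be written without the explicit loop. The sketch below uses `DataFrame.assign` and `DataFrame.query`. The four-row `df1` is a made-up subset of the question's data, and `with_year`, `rows_2018` and `cheap_2018` are illustrative names.

```python
import pandas as pd

# Hypothetical subset of the question's data
df1 = pd.DataFrame({
    "pd_date": ["2018 年 3 月", "2018 年 8 月 29 日", "2017 年 9 月 30 日", "2018 年 9 月"],
    "price": [1499, 3499, 2598, 2399],
})

# Add the year prefix as a new column without modifying df1 itself
with_year = df1.assign(year=df1["pd_date"].str[:4])

# query() takes the condition as a string; the year column holds strings
rows_2018 = with_year.query("year == '2018'")
cheap_2018 = with_year.query("year == '2018' and price < 2500")
```

Since `assign` returns a new frame, `df1` keeps its original two columns. Whether that matters depends on how the data is used afterwards.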

lulu00147

2018-10-18 16:08:45 +08:00 via iPhone

Python for Data Analysis (利用 Python 进行数据分析).epub

jswangjieda

2018-10-21 09:18:19 +08:00

df1[(df1['pd_date'].str[:4] == '2018') & (df1['price'] < 2500)]
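
The one-liner above relies on every date string starting with the four-digit year and on `price` already being numeric. If either is in doubt, a slightly more defensive sketch could look like the following. This is only an assumption about what the real data might contain: the five-row `df1` is a made-up sample, and `year`, `price` and `result` are illustrative names.

```python
import pandas as pd

# Hypothetical sample; the real df1 comes from the question
df1 = pd.DataFrame({
    "pd_date": ["2018 年 3 月", "2018 年 8 月 29 日", "2017 年 9 月 30 日", "2017 年", "2018 年 9 月"],
    "price": [1499, 3499, 2598, 2299, 2399],
})

# Take the first run of four digits in each date string and convert it to a number;
# strings with no such run become NaN and never match the comparison below
year = pd.to_numeric(df1["pd_date"].str.extract(r"(\d{4})", expand=False), errors="coerce")

# Coerce price to numeric in case it was read in as text
price = pd.to_numeric(df1["price"], errors="coerce")

result = df1[(year == 2018) & (price < 2500)]
```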