Using the pandas.DataFrame.to_dict method in Python

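Pandas is a tool built on top of NumPy and was created for data analysis tasks. It incorporates a number of libraries and standard data models, and provides the tools needed to work efficiently with large datasets. Pandas also offers many functions and methods for processing data quickly and conveniently, which is one of the main reasons Python is a powerful and efficient environment for data analysis. This article describes how to use the pandas.DataFrame.to_dict method.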

DataFrame.to_dict(orient='dict', into=<class 'dict'>)

Convert the DataFrame to a dictionary (dict).

The type of the key-value pairs can be customized with the parameters (see below).

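Parameters:

orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}

Determines the type of the values of the dictionary.

1) 'dict' (default) : dict like {column -> {index -> value}}

2) 'list' : dict like {column -> [values]}

3) 'series' : dict like {column -> Series(values)}

4) 'split' : dict like {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}

5) 'records' : list like [{column -> value}, … , {column -> value}]

6) 'index' : dict like {index -> {column -> value}}

In the pandas version documented here, abbreviations are allowed: s stands for series and sp stands for split. Newer pandas releases deprecated and later removed these abbreviations, so spelling out the full name is the safer choice.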
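As a rough illustration of how orient changes the shape of the result, the sketch below compares the default 'dict' orientation with 'list'. The frame and the variable names (df, as_dict, as_list) are chosen here purely for illustration.

```python
import pandas as pd

# Small illustrative frame; the column and index labels are arbitrary.
df = pd.DataFrame({'col1': [1, 2], 'col2': [0.5, 0.75]},
                  index=['row1', 'row2'])

# 'dict' (default): {column -> {index -> value}}
as_dict = df.to_dict(orient='dict')

# 'list': {column -> [values]}; the index labels are not kept
as_list = df.to_dict(orient='list')

print(as_dict)
print(as_list)
```

Because 'list' discards the index labels, it is mainly useful when only the column values matter.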

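into : class, default dict

The collections.abc.Mapping subclass used for all mappings in the return value.

It can be either the actual class or an empty instance of the mapping type you want.

If you want a collections.defaultdict, you must pass an initialized instance.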
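The difference between passing a class and passing an instance can be sketched as follows. This is a minimal example under the assumption that pandas rejects the bare defaultdict class with a TypeError, as the note above implies. The variable names are hypothetical.

```python
from collections import OrderedDict, defaultdict

import pandas as pd

df = pd.DataFrame({'col1': [1, 2]}, index=['row1', 'row2'])

# Passing a class: every mapping in the result uses that class.
ordered = df.to_dict(into=OrderedDict)

# Passing an initialized defaultdict instance: its default_factory is reused.
dd = df.to_dict(into=defaultdict(list))

# Passing the bare defaultdict class is expected to fail (assumption:
# a TypeError is raised because no default_factory is available).
try:
    df.to_dict(into=defaultdict)
    bare_class_rejected = False
except TypeError:
    bare_class_rejected = True

print(type(ordered).__name__, type(dd).__name__, bare_class_rejected)
```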

Returns:

dict, list or collections.abc.Mapping

Returns a collections.abc.Mapping object representing the DataFrame.

The resulting transformation depends on the orient parameter.

Examples

>>> import pandas as pd
>>> df = pd.DataFrame({'col1': [1, 2],
...                    'col2': [0.5, 0.75]},
...                   index=['row1', 'row2'])
>>> df
      col1  col2
row1     1  0.50
row2     2  0.75
>>> df.to_dict()
{'col1': {'row1': 1, 'row2': 2}, 'col2': {'row1': 0.5, 'row2': 0.75}}

You can specify the orientation of the returned value.

>>> df.to_dict('series')
{'col1': row1    1
         row2    2
Name: col1, dtype: int64,
'col2': row1    0.50
        row2    0.75
Name: col2, dtype: float64}
>>> df.to_dict('split')
{'index': ['row1', 'row2'], 'columns': ['col1', 'col2'],
 'data': [[1, 0.5], [2, 0.75]]}
>>> df.to_dict('records')
[{'col1': 1, 'col2': 0.5}, {'col1': 2, 'col2': 0.75}]
>>> df.to_dict('index')
{'row1': {'col1': 1, 'col2': 0.5}, 'row2': {'col1': 2, 'col2': 0.75}}

You can also specify the mapping type.

>>> from collections import OrderedDict, defaultdict
>>> df.to_dict(into=OrderedDict)
OrderedDict([('col1', OrderedDict([('row1', 1), ('row2', 2)])),
             ('col2', OrderedDict([('row1', 0.5), ('row2', 0.75)]))])

If you want a defaultdict, you need to initialize it:

>>> dd = defaultdict(list)
>>> df.to_dict('records', into=dd)
[defaultdict(<class 'list'>, {'col1': 1, 'col2': 0.5}),
 defaultdict(<class 'list'>, {'col1': 2, 'col2': 0.75})]
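One possible use of the 'split' orientation is a round trip back to a DataFrame, since it keeps the index, the columns and the data as separate lists. The sketch below is not part of the examples above. It assumes those three pieces can be passed straight to the DataFrame constructor, and the names parts and rebuilt are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [0.5, 0.75]},
                  index=['row1', 'row2'])

# 'split' returns {'index': [...], 'columns': [...], 'data': [[...], ...]}
parts = df.to_dict(orient='split')

# Feed the three pieces back to the DataFrame constructor.
rebuilt = pd.DataFrame(data=parts['data'],
                       index=parts['index'],
                       columns=parts['columns'])

print(rebuilt)
```

Column dtypes are inferred again during reconstruction, so for frames with less regular data the result may not match the original exactly.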