Pandas DataFrame: Creating an Object with Custom Row and Column Indexes

千江有水千江月, posted 2020/12/29 13:13:39
Course name and link

[AI基础课程--常用框架工具] (AI Fundamentals Course: Common Frameworks and Tools)

Environment

* ModelArts
  * Notebook - Multi-Engine 2.0 (python3)
    * JupyterLab - Notebook - Conda-python3
      * pandas 0.22.0
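
Because the behaviour of some calls depends on the installed versions, it can help to confirm what the notebook kernel is actually running. A minimal sketch, assuming pandas and NumPy are both importable in the current kernel:

```python
import numpy as np
import pandas as pd

# Print the versions the current kernel has loaded
print("pandas", pd.__version__)
print("numpy", np.__version__)
```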


Pandas DataFrame: creating an object with custom row and column indexes

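import pandas as pd
import numpy as np

# Generate a date sequence
# Row index: 7 entries
day_seq = pd.date_range('20201229',periods=7)
print("day_seq\n",day_seq)

# Samples drawn from the standard normal distribution
# 7 rows, 4 columns
data = np.random.randn(7,4)
print("data\n",data)

# Column index: 4 entries
cols = list("abcd")
print("cols\n",cols)

# Build the DataFrame with the custom row and column indexes
print(pd.DataFrame(data,index=day_seq,columns=cols))

Output (the values are random, so they differ on every run):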
day_seq
 DatetimeIndex(['2020-12-29', '2020-12-30', '2020-12-31', '2021-01-01',
               '2021-01-02', '2021-01-03', '2021-01-04'],
              dtype='datetime64[ns]', freq='D')
data
 [[-0.39216153 -1.74119599  2.41303649 -1.01261741]
 [ 0.76657191 -0.391622   -0.49027351 -0.33393076]
 [-0.49926821 -1.66764719 -0.03899018 -0.61928199]
 [-0.22637096 -1.07672881  0.1192167  -0.51951244]
 [ 0.84630321  0.33698375  2.2679713  -0.03640361]
 [ 0.42203478 -0.46882363  0.26381046 -1.0218011 ]
 [-1.17612578  2.05126248 -0.67079942  0.248533  ]]
cols
 ['a', 'b', 'c', 'd']
                   a         b         c         d
2020-12-29 -0.392162 -1.741196  2.413036 -1.012617
2020-12-30  0.766572 -0.391622 -0.490274 -0.333931
2020-12-31 -0.499268 -1.667647 -0.038990 -0.619282
2021-01-01 -0.226371 -1.076729  0.119217 -0.519512
2021-01-02  0.846303  0.336984  2.267971 -0.036404
2021-01-03  0.422035 -0.468824  0.263810 -1.021801
2021-01-04 -1.176126  2.051262 -0.670799  0.248533
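
Once the custom indexes are in place, rows and columns can be selected by label rather than by position. The following is a minimal sketch of that idea, not part of the original notebook: it reuses the same dates and column names, but fills the frame with fixed values from `np.arange` (an assumption made here so the results are predictable) instead of random samples.

```python
import numpy as np
import pandas as pd

day_seq = pd.date_range('20201229', periods=7)
cols = list("abcd")

# Fixed values 0..27 instead of random samples, so results are reproducible
data = np.arange(28, dtype=float).reshape(7, 4)
df = pd.DataFrame(data, index=day_seq, columns=cols)

# Single cell: row label (a timestamp) and column label
cell = df.loc[pd.Timestamp('2020-12-31'), 'b']

# Label-based slice: both endpoints are included, unlike positional slicing
window = df.loc['2021-01-01':'2021-01-03', ['a', 'd']]

print(cell)
print(window)
```

Note that `.loc` slices by label include the end label, so the slice above returns three rows.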


help

help(pd.date_range)

Help on function date_range in module pandas.core.indexes.datetimes:

date_range(start=None, end=None, periods=None, freq='D', tz=None, normalize=False, name=None, closed=None, **kwargs)
    Return a fixed frequency DatetimeIndex, with day (calendar) as the default
    frequency
    
    Parameters
    ----------
    start : string or datetime-like, default None
        Left bound for generating dates
    end : string or datetime-like, default None
        Right bound for generating dates
    periods : integer, default None
        Number of periods to generate
    freq : string or DateOffset, default 'D' (calendar daily)
        Frequency strings can have multiples, e.g. '5H'
    tz : string, default None
        Time zone name for returning localized DatetimeIndex, for example
        Asia/Hong_Kong
    normalize : bool, default False
        Normalize start/end dates to midnight before generating date range
    name : string, default None
        Name of the resulting DatetimeIndex
    closed : string, default None
        Make the interval closed with respect to the given frequency to
        the 'left', 'right', or both sides (None)
    
    Notes
    -----
    Of the three parameters: ``start``, ``end``, and ``periods``, exactly two
    must be specified.
    
    To learn more about the frequency strings, please see `this link
    <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.
    
    Returns
    -------
    rng : DatetimeIndex
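
The help text above says that two of `start`, `end`, and `periods` must be given. As a rough illustration, the sketch below builds the same seven-day index in two ways and then uses a multiple in the frequency string. The dates match the article; the variable names are made up for this example. The help output shown is from pandas 0.22.0, and some parameters differ in later releases, so it is worth checking `help(pd.date_range)` in your own environment.

```python
import pandas as pd

# start + periods: the combination used in the article
by_periods = pd.date_range(start='2020-12-29', periods=7)

# start + end: produces the same seven days (both bounds are included by default)
by_end = pd.date_range(start='2020-12-29', end='2021-01-04')

# A multiple in the frequency string: every second day
every_other = pd.date_range(start='2020-12-29', periods=4, freq='2D')

print(by_periods.equals(by_end))
print(every_other)
```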

help(np.random.randn)

Help on built-in function randn:

randn(...) method of numpy.random.mtrand.RandomState instance
    randn(d0, d1, ..., dn)
    
    Return a sample (or samples) from the "standard normal" distribution.
    
    .. note::
        This is a convenience function for users porting code from Matlab,
        and wraps `standard_normal`. That function takes a
        tuple to specify the size of the output, which is consistent with
        other NumPy functions like `numpy.zeros` and `numpy.ones`.
    
    .. note::
        New code should use the ``standard_normal`` method of a ``default_rng()``
        instance instead; see `random-quick-start`.
    
    If positive int_like arguments are provided, `randn` generates an array
    of shape ``(d0, d1, ..., dn)``, filled
    with random floats sampled from a univariate "normal" (Gaussian)
    distribution of mean 0 and variance 1. A single float randomly sampled
    from the distribution is returned if no argument is provided.
    
    Parameters
    ----------
    d0, d1, ..., dn : int, optional
        The dimensions of the returned array, must be non-negative.
        If no argument is given a single Python float is returned.
    
    Returns
    -------
    Z : ndarray or float
        A ``(d0, d1, ..., dn)``-shaped array of floating-point samples from
        the standard normal distribution, or a single such float if
        no parameters were supplied.
    
    See Also
    --------
    standard_normal : Similar, but takes a tuple as its argument.
    normal : Also accepts mu and sigma arguments.
    Generator.standard_normal: which should be used for new code.
    
    Notes
    -----
    For random samples from :math:`N(\mu, \sigma^2)`, use:
    
    ``sigma * np.random.randn(...) + mu``
    
    Examples
    --------
    >>> np.random.randn()
    2.1923875335537315  # random
    
    Two-by-four array of samples from N(3, 6.25):
    
    >>> 3 + 2.5 * np.random.randn(2, 4)
    array([[-4.49401501,  4.00950034, -1.81814867,  7.29718677],   # random
           [ 0.39924804,  4.68456316,  4.99394529,  4.84057254]])  # random
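
The note in the help text recommends the `standard_normal` method of a `default_rng()` instance for new code. A minimal sketch of the article's DataFrame built that way is shown below. It assumes NumPy 1.17 or later (where `default_rng` is available), and the seed value 42 is an arbitrary choice made for this example so that the run can be repeated.

```python
import numpy as np
import pandas as pd

# Seeded generator: the seed (42) is arbitrary, chosen only for repeatability
rng = np.random.default_rng(seed=42)

# standard_normal takes the shape as a tuple, unlike randn(7, 4)
data = rng.standard_normal((7, 4))
df = pd.DataFrame(data,
                  index=pd.date_range('20201229', periods=7),
                  columns=list("abcd"))

# A fresh generator with the same seed reproduces the same samples
data_again = np.random.default_rng(seed=42).standard_normal((7, 4))
print(np.array_equal(data, data_again))

# Samples from N(3, 6.25), following the sigma * z + mu formula in the help text
shifted = 3 + 2.5 * rng.standard_normal((2, 4))
print(df)
```

Seeding is optional; without it, each run produces different values, just as `np.random.randn` does in the article.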


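Notes

1. Thanks to the instructor for the teaching and the course materials.
2. Fellow students are welcome to share what they have learned ^_^
3. The sandbox labs, certifications, forum, and live streams contain a lot of high-quality content and are worth exploring.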


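[Copyright notice] This article is original content by a Huawei Cloud community user. When reposting, you must credit the source (Huawei Cloud community), the article link, the author, and other basic information; otherwise the author and the community reserve the right to pursue liability. If you find content in this community that appears to be plagiarized, please report it by email with supporting evidence. Once verified, the community will promptly remove the infringing content. Report email: cloudbbs@huaweicloud.com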