
Python Data Visualization Example: Heatmap

Date: 2018-11-13 16:37:56


(Visualizing relational data)

A heatmap shows how two discrete variables combine

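A heatmap is sometimes also called a cross-filled table. Its most typical use is visualizing a contingency table, that is, showing graphically how two discrete variables combine. Readers can draw a heatmap with the heatmap function in the seaborn module. As usual, we first explain the function's usage and the meaning of its parameters: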
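To make the idea of a contingency table concrete, here is a minimal sketch that builds one with pandas. The records, the column names gender and size, and their values are invented for illustration and are not part of the store data used later.

```python
import pandas as pd

# Hypothetical records: each row carries two discrete variables
records = pd.DataFrame({
    "gender": ["F", "M", "F", "M", "F", "F", "M", "M"],
    "size":   ["S", "M", "M", "L", "S", "L", "M", "M"],
})

# Contingency table: rows are gender, columns are size, cells are counts
table = pd.crosstab(records["gender"], records["size"])
print(table)
```

A table of this shape (one discrete variable on the rows, the other on the columns) is the kind of input that heatmap is typically given through its data parameter.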

heatmap(data, vmin=None, vmax=None, cmap=None, center=None, annot=None, fmt='.2g', annot_kws=None, linewidths=0, linecolor='white', cbar=True, cbar_kws=None, square=False, xticklabels='auto', yticklabels='auto', mask=None, ax=None)

data: the dataset to draw as a heatmap.
vmin, vmax: the values that anchor the lower and upper ends of the color scale (and therefore of the colorbar legend).
cmap: a colormap name or object used as the fill colors of the heatmap. (Supported values are 'Accent', 'Accent_r', 'Blues', 'Blues_r', 'BrBG', 'BrBG_r', 'BuGn', 'BuGn_r', 'BuPu', 'BuPu_r', 'CMRmap', 'CMRmap_r', 'Dark2', 'Dark2_r', 'GnBu', 'GnBu_r', 'Greens', 'Greens_r', 'Greys', 'Greys_r', 'OrRd', 'OrRd_r', 'Oranges', 'Oranges_r', 'PRGn', 'PRGn_r', 'Paired', 'Paired_r', 'Pastel1', 'Pastel1_r', 'Pastel2', 'Pastel2_r', 'PiYG', 'PiYG_r', 'PuBu', 'PuBuGn', 'PuBuGn_r', 'PuBu_r', 'PuOr', 'PuOr_r', 'PuRd', 'PuRd_r', 'Purples', 'Purples_r', 'RdBu', 'RdBu_r', 'RdGy', 'RdGy_r', 'RdPu', 'RdPu_r', 'RdYlBu', 'RdYlBu_r', 'RdYlGn', 'RdYlGn_r', 'Reds', 'Reds_r', 'Set1', 'Set1_r', 'Set2', 'Set2_r', 'Set3', 'Set3_r', 'Spectral', 'Spectral_r', 'Wistia', 'Wistia_r', 'YlGn', 'YlGnBu', 'YlGnBu_r', 'YlGn_r', 'YlOrBr', 'YlOrBr_r', 'YlOrRd', 'YlOrRd_r', 'afmhot', 'afmhot_r', 'autumn', 'autumn_r', 'binary', 'binary_r', 'bone', 'bone_r', 'brg', 'brg_r', 'bwr', 'bwr_r', 'cividis', 'cividis_r', 'cool', 'cool_r', 'coolwarm', 'coolwarm_r', 'copper', 'copper_r', 'cubehelix', 'cubehelix_r', 'flag', 'flag_r', 'gist_earth', 'gist_earth_r', 'gist_gray', 'gist_gray_r', 'gist_heat', 'gist_heat_r', 'gist_ncar', 'gist_ncar_r', 'gist_rainbow', 'gist_rainbow_r', 'gist_stern', 'gist_stern_r', 'gist_yarg', 'gist_yarg_r', 'gnuplot', 'gnuplot2', 'gnuplot2_r', 'gnuplot_r', 'gray', 'gray_r', 'hot', 'hot_r', 'hsv', 'hsv_r', 'icefire', 'icefire_r', 'inferno', 'inferno_r', 'jet', 'jet_r', 'magma', 'magma_r', 'mako', 'mako_r', 'nipy_spectral', 'nipy_spectral_r', 'ocean', 'ocean_r', 'pink', 'pink_r', 'plasma', 'plasma_r', 'prism', 'prism_r', 'rainbow', 'rainbow_r', 'rocket', 'rocket_r', 'seismic', 'seismic_r', 'spring', 'spring_r', 'summer', 'summer_r', 'tab10', 'tab10_r', 'tab20', 'tab20_r', 'tab20b', 'tab20b_r', 'tab20c', 'tab20c_r', 'terrain', 'terrain_r', 'turbo', 'turbo_r', 'twilight', 'twilight_r', 'twilight_shifted', 'twilight_shifted_r', 'viridis', 'viridis_r', 'vlag', 'vlag_r', 'winter', 'winter_r'.)
center: the value at which the colormap is centered; changing it shifts how light or dark the cells appear.
annot: a bool, or an array with the same shape as data. If True, the value is written in each cell of the heatmap.
fmt: the display format of the values written in the cells.
annot_kws: other properties of the value labels in the cells, such as color and size.
linewidths: the border width of each cell.
linecolor: the border color of each cell.
cbar: bool; whether to draw a colorbar as the legend. Defaults to True.
square: bool; whether to make every cell of the heatmap square. Defaults to False.
cbar_kws: other properties of the colorbar.
xticklabels, yticklabels: the tick labels of the heatmap's x and y axes. If True, the data frame's column names and row names are used, respectively.
mask: a boolean array with the same shape as data; cells where it is True are not drawn, which lets the remaining cells stand out.
ax: the Axes (subplot) on which to draw the heatmap.
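The following sketch shows several of these parameters working together. The matrix is random, made-up data; the output file name heatmap_demo.png and the choice of the Agg backend are assumptions made so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Made-up symmetric 4x4 matrix with positive and negative entries
rng = np.random.default_rng(0)
values = rng.normal(size=(4, 4))
values = (values + values.T) / 2

# mask: cells marked True are hidden; here, everything above the diagonal
mask = np.triu(np.ones_like(values, dtype=bool), k=1)

fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(
    values,
    mask=mask,                     # hide the redundant upper triangle
    center=0,                      # put the middle of the colormap at zero
    cmap="RdBu_r",                 # diverging colormap from the list above
    annot=True,                    # write the value in each visible cell
    fmt=".2f",                     # two decimal places
    linewidths=0.5,                # thin borders between cells
    square=True,                   # force square cells
    cbar_kws={"label": "value"},   # label the colorbar
    ax=ax,                         # draw on this Axes
)
fig.savefig("heatmap_demo.png")    # hypothetical output file name
```

With center=0 and a diverging colormap, negative and positive values take opposite hues, which is usually what one wants for data that has a meaningful midpoint.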

Next, taking the transaction data of a clothing store as an example, we compute the total sales for each month of each year.
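As a minimal sketch of this aggregation step, the snippet below runs pivot_table on a tiny invented table. The column names Date and Sales follow the article's code; the dates and amounts are made up and do not come from the store's data.

```python
import pandas as pd

# Hypothetical stand-in for the store's transaction records
sales = pd.DataFrame({
    "Date": pd.to_datetime([
        "2011-01-05", "2011-01-20", "2011-02-03",
        "2012-01-11", "2012-02-14", "2012-02-28",
    ]),
    "Sales": [100.0, 50.0, 80.0, 120.0, 60.0, 40.0],
})

# Derive year and month from the transaction date
sales["year"] = sales["Date"].dt.year
sales["month"] = sales["Date"].dt.month

# Rows are months, columns are years, cells are summed sales
summary = sales.pivot_table(index="month", columns="year",
                            values="Sales", aggfunc="sum")
print(summary)
```

Each cell of summary holds the sum of Sales for one (month, year) pair, which is the layout that heatmap expects.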

We then use the heatmap function introduced above to visualize the aggregated result. The code is as follows:

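import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Set the plotting style
plt.style.use('ggplot')
# Use a font that can render Chinese characters
plt.rcParams['font.sans-serif'] = ['Microsoft YaHei']
# Render the minus sign on the axes correctly
plt.rcParams['axes.unicode_minus'] = False

# Read the data
Sales = pd.read_excel(r'服装店的交易数据.xlsx')
# Derive year and month fields from the transaction date
Sales['year'] = Sales.Date.dt.year
Sales['month'] = Sales.Date.dt.month
# Total sales for each month of each year
Summary = Sales.pivot_table(index='month', columns='year', values='Sales', aggfunc='sum')
# Print the sales totals in contingency-table form
print(Summary.head(13))

# Draw the heatmap
sns.heatmap(data=Summary,    # data to plot
            cmap='PuBuGn',   # fill colors
            linewidths=.1,   # border width of each cell
            annot=True,      # show the values
            fmt='.1e'        # show the values in scientific notation
            )
# Add a title (Chinese for "Heatmap of total sales by month for each year")
plt.title('每年各月份销售总额热力图')
# Show the figure
plt.show()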

Result:

year
month
1      520452.5595  334535.0605  255919.2030  341339.2470
2      333909.5565  271881.9480  299890.1410  281270.1790
3      411628.7290  217808.0065  296151.7510  387093.7650
4      406848.7620  266968.5890  290384.4670  278402.9940
5      228025.5680  287796.5150  264673.6260  384588.0615
6      273758.8780  293600.7750  196918.1455  316775.7855
7      412797.4600  240297.1585  287905.1865  275160.0495
8      329754.7150  205789.6440  275211.3295  306671.2835
9      325292.3145  419689.7785  278230.1660  319675.1765
10     347173.8005  368544.9250  305660.4510  351438.0925
11     253867.1960  295010.9555  385452.7300  261206.4290
12     420420.2355  368093.9540  328898.4945  351756.4180

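The output is in contingency-table form and shows the total sales for each month of each year. In the raw numbers, differences in sales performance across months are hard to spot quickly by eye; presenting the table as a heatmap makes the task much simpler.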

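The figure above is the result of visualizing the table. The shade of each cell represents the magnitude of its value, so the colors alone quickly reveal how good or bad sales were in each month of each year.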
