
python - Pandas dataframe manipulation and plotting

Question:

Using WinPython 3.4 and matplotlib 1.3.1, I'm pulling data for a dataframe from a MySQL database. The raw dataframe that I get from the query looks like:

    wafer_number test_type  test_pass  x_coord  y_coord  test_el_id  wavelength  intensity
 0        HT2731        T2          1       38       54          24      288.68       4413
 1        HT2731        T2          1       40       54          25      257.42       2595
 2        HT2731        T2          1       50       54          28      300.00       2836
 3        HT2731        T2          1       52       54          29      300.00       2862
 4        HT2731        T2          1       54       54          30      300.00       3145
 5        HT2731        T2          1       56       54          31      300.00       2804
 6        HT2731        T2          1       58       54          32      255.69       2803
 7        HT2731        T2          1       59       54          33      257.23       2991
 8        HT2731        T2          1       60       54          34      262.45       3946
 9        HT2731        T2          1       62       54          35      291.84       9398
10        HT2801        T2          1       38       55          54      288.68       4125
11        HT2801        T2          1       38       56          55      265.25       4258

What I need is to plot wavelength and intensity on the x and y axes respectively, with each different wafer number as its own series. I need to keep the x_coord and y_coord variables so that I can identify standout data points later, ideally by clicking on them and adding them to a list. I'll get that working after I get these things plotted.

I thought that using the dataframe's built-in plotting capability requires me to call the pivot_table method

wl_vs_int = results.pivot_table(values='intensity', rows=['x_coord', 'y_coord','wavelength'], cols='wafer_number')

on my dataframe which then turns the dataframe into:

wafer_number                HT2478  HT2625  HT2644  HT2671  HT2673  HT2719  HT2731  HT2796  HT2801
x_coord y_coord wavelength
27      35      289.07         NaN     NaN     NaN    5137     NaN     NaN     NaN     NaN     NaN
        36      250.88        4585     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
        37      260.90         NaN     NaN     NaN     NaN    4270     NaN     NaN     NaN     NaN
        38      288.87         NaN     NaN     NaN    8191     NaN     NaN     NaN     NaN     NaN
        40      259.74         NaN     NaN     NaN     NaN   17027     NaN     NaN     NaN     NaN
        41      259.74         NaN     NaN     NaN     NaN   18742     NaN     NaN     NaN     NaN
        42      259.74         NaN     NaN     NaN     NaN   34098     NaN     NaN     NaN     NaN
28      34      268.27         NaN     NaN     NaN     NaN    2080     NaN     NaN     NaN     NaN
        38      257.42        7727     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
        44      260.13         NaN     NaN     NaN     NaN   55329     NaN     NaN     NaN     NaN

but now the index is a multi-index of the x, y coords and the wavelength, so when I just try to plot the wavelength against the columns,

plt.scatter(wl_vs_int.wavelength, wl_vs_int.columns)

I get the AttributeError:

AttributeError: 'DataFrame' object has no attribute 'wavelength'

I've tried to reindex the dataframe back to a default index but that still gives me the results that 'DataFrame' object has no 'wavelength' attribute.

There's got to be a better way to either rearrange the dataframe to make this possible through the built-in dataframe plotting capabilities or to plot only select columns vs other columns (with the columns being dynamic). I'm clearly new to python and pandas but I've spent days of time trying to do this in different ways and with no results. Any help would be greatly appreciated. Thanks.

Answer:

To plot wavelength and intensity on the x and y axes respectively, with each different wafer number as its own series, you can group the data by wafer_number and then plot each group separately:

import pandas as pd
from io import StringIO  # Python 3; the StringIO module only exists in Python 2
import matplotlib.pyplot as plt

data = \
"""wafer_number,test_type,test_pass,x_coord,y_coord,test_el_id,wavelength,intensity
HT2731,T2,1,38,54,24,288.68,4413
HT2731,T2,1,40,54,25,257.42,2595
HT2731,T2,1,50,54,28,300.00,2836
HT2731,T2,1,52,54,29,300.00,2862
HT2731,T2,1,54,54,30,300.00,3145
HT2731,T2,1,56,54,31,300.00,2804
HT2731,T2,1,58,54,32,255.69,2803
HT2731,T2,1,59,54,33,257.23,2991
HT2731,T2,1,60,54,34,262.45,3946
HT2731,T2,1,62,54,35,291.84,9398
HT2801,T2,1,38,55,54,288.68,4125
HT2801,T2,1,38,56,55,265.25,4258"""

df = pd.read_csv(StringIO(data),sep = ',')
dfg = df.groupby('wafer_number')

colors = ('b', 'g', 'r', 'c', 'm', 'y', 'k')
fig, ax = plt.subplots()
# Iterating over a groupby yields (key, sub-DataFrame) pairs
for i, (wafer, group) in enumerate(dfg):
    color = colors[i % len(colors)]
    # ls='' suppresses connecting lines so only the markers are drawn
    ax.plot(group['wavelength'].values, group['intensity'].values,
            ls='', color=color, label=wafer, marker='o', markersize=8)
legend = ax.legend(loc='upper center', shadow=True)
plt.xlabel('wavelength')
plt.ylabel('intensity')
plt.show()
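
On the AttributeError itself: after pivot_table, wavelength is a level of the row MultiIndex rather than a column, which would explain why wl_vs_int.wavelength fails. Below is a minimal sketch of the pivot route, assuming a current pandas where the pivot arguments are spelled index= and columns= (the question's rows=/cols= are the older spelling). The four-row sample and the names flat, wafer_cols and plot_wafers are made up for illustration.

```python
import io

import pandas as pd

# Small illustrative sample (a subset of the columns shown in the question)
csv_text = """wafer_number,x_coord,y_coord,wavelength,intensity
HT2731,38,54,288.68,4413
HT2731,40,54,257.42,2595
HT2801,38,55,288.68,4125
HT2801,38,56,265.25,4258"""
results = pd.read_csv(io.StringIO(csv_text))

# Assumption: current pandas, where the arguments are index= and columns=
wl_vs_int = results.pivot_table(values='intensity',
                                index=['x_coord', 'y_coord', 'wavelength'],
                                columns='wafer_number')

# Here wavelength lives in the row MultiIndex, not in the columns.
# reset_index() turns every index level back into an ordinary column.
flat = wl_vs_int.reset_index()
flat.columns.name = None  # drop the leftover 'wafer_number' axis label

# Whatever is not a coordinate/wavelength column is a wafer series,
# so the list adapts to whichever wafers the query returned.
wafer_cols = [c for c in flat.columns
              if c not in ('x_coord', 'y_coord', 'wavelength')]


def plot_wafers(flat, wafer_cols):
    """Scatter intensity against wavelength, one series per wafer column."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for wafer in wafer_cols:
        # Each wafer column is NaN wherever that wafer has no measurement
        series = flat[['wavelength', wafer]].dropna()
        ax.scatter(series['wavelength'], series[wafer], label=wafer)
    ax.set_xlabel('wavelength')
    ax.set_ylabel('intensity')
    ax.legend()
    return fig, ax
```

Calling plot_wafers(flat, wafer_cols) and then matplotlib.pyplot.show() should give one series per wafer, and flat keeps x_coord and y_coord next to every point. Note that pivot_table averages by default, so duplicate (x_coord, y_coord, wavelength) rows for the same wafer would be collapsed into their mean, whereas the groupby approach above plots every raw row.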