Displaying dialysis centers based on total performance

Almost every exploration of such provider-centered data will include analyzing facilities based on their quality scores. We will dive into that next.

First, let's get a count on the different scores received by the dialysis facilities:

print(df.groupby('Total Performance Score').size())

The output is as follows:

Total Performance Score
10           10
100          30
11            2
12            2
13            1
14            3
15            1
...
95           15
96            2
97           11
98            8
99           12
No Score    276
Length: 95, dtype: int64

One thing to notice is that the Total Performance Score column is in a string rather than integer format, and so to do numerical sorting, we must first convert the column to integer format. Second, after running the preceding code, you'll notice that 276 dialysis facilities have a value of No Score for the Total Performance Score column. These rows must be eliminated before converting to integer format to avoid throwing an error.

In the following code, we first eliminate the No Score rows, and then we use the to_numeric() function of pandas to convert the string column to an integer column:

df_filt= df.loc[df['Total Performance Score'] != 'No Score']
df_filt['Total Performance Score'] = pd.to_numeric(
df_filt['Total Performance Score']
)

Now, we create a new DataFrame that selects just a few columns that we are interested in and sorts them, with the worst performing centers at the top. Such a code block would be helpful for identifying the worst performing dialysis centers, for example. We display the first five results:

df_tps = df_filt[[
'Facility Name',
'State',
'Total Performance Score'
]].sort_values('Total Performance Score')
print(df_tps.head(n=5))

The output is as follows:

                                   Facility Name State  
5622   462320 PRIMARY CHILDREN'S DIALYSIS CENTER    UT   
698              PEDIATRIC DIALYSIS UNIT AT UCSF    CA   
6766                  VITAL LIFE DIALYSIS CENTER    FL   
4635  BELMONT COURT DIALYSIS - DOYLESTOWN CAMPUS    PA   
3763                       WOODMERE DIALYSIS LLC    NY   

      Total Performance Score  
5622                        5  
698                         7  
6766                        8  
4635                        8  
3763                        9

Alternatively, if we wished to analyze the mean total performance of each state's dialysis centers, we could do that by using numpy.mean() in conjunction with groupby():

import numpy as np

df_state_means = df_filt.groupby('State').agg({
'Total Performance Score': np.mean
})
print(df_state_means.sort_values('Total Performance Score', ascending=False))

The output is as follows:

       Total Performance Score
State                         
ID                   73.178571
WY                   71.777778
HI                   70.500000
UT                   70.421053
CO                   70.173333
WA                   70.146067
ME                   70.058824
OR                   70.046154
KS                   69.480769
AZ                   68.905983
...

According to the results of this query, Idaho and Wyoming have the nation's best performing dialysis centers. You could also add a column including the number of dialysis centers in each state using the following modification:

import numpy as np

df_state_means = df_filt.groupby('State').agg({
'Total Performance Score': np.mean,
'State': np.size
})
print(df_state_means.sort_values('Total Performance Score', ascending=False))

The output is as follows:

       Total Performance Score  State
State                                
ID                   73.178571     28
WY                   71.777778      9
HI                   70.500000     26
UT                   70.421053     38
CO                   70.173333     75
WA                   70.146067     89
ME                   70.058824     17
OR                   70.046154     65
KS                   69.480769     52
AZ                   68.905983    117
...

The results indicate that when considering only states that have at least 100 dialysis centers, Arizona has the best total performance.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset