
[ using df.apply and str.contains('value', case =False) ]

Data:

A     |    B   |    C    
========================
Value | Fred   |    0
foo   | Jim    |    1
Value | Bob    |    2

I have written a method:

def is_value(df):
    if df['A'].str.contains('value', case=False):
        b='X'
        return b

I call it with:

df['B'] = df.apply(is_value, axis=1)

and get the following error:

AttributeError: ("'str' object has no attribute 'str'", 'occurred at index 0')

Is this allowed in apply?

It works with this idiom:

df = df.loc[df['A'].str.contains('Value', case=False) & (df['C'] != 0)]
df['A'] = 'X'

Is there a better way?

Answer 1


I think it is best not to use apply unless it is necessary, because it works row by row and is usually slower than a vectorized operation.
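
As for the error itself: with axis=1, apply passes one row at a time to the function as a Series, so row['A'] is a plain Python str, which has no .str accessor. If a row-wise version is really wanted, a minimal sketch could look like the following. The sample frame is rebuilt from the table in the question, and keeping the existing B value for non-matching rows is an assumption about the intended result.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    'A': ['Value', 'foo', 'Value'],
    'B': ['Fred', 'Jim', 'Bob'],
    'C': [0, 1, 2],
})

def is_value(row):
    # row['A'] is a plain str here, so ordinary string methods apply.
    if 'value' in row['A'].lower():
        return 'X'
    # Assumption: rows that do not match keep their current B value.
    return row['B']

df['B'] = df.apply(is_value, axis=1)
print(df)
```

This runs, but it still calls a Python function once per row, which is why the vectorized forms are preferable.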

I think you can use mask:

print(df['A'].str.contains('value', case=False))
0     True
1    False
2     True
Name: A, dtype: bool

# call mask on column B, so that a single Series is assigned back
df['B'] = df['B'].mask(df['A'].str.contains('value', case=False), 'X')
print(df)
       A    B  C
0  Value    X  0
1    foo  Jim  1
2  Value    X  2

Another solution with loc:

df.loc[df['A'].str.contains('value', case=False), 'B'] = 'X'
print(df)
       A    B  C
0  Value    X  0
1    foo  Jim  1
2  Value    X  2
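
For a version that runs end to end, here is a small sketch that rebuilds the sample frame from the question and checks that both approaches produce the same column. The variable names cond, masked and df_loc are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    'A': ['Value', 'foo', 'Value'],
    'B': ['Fred', 'Jim', 'Bob'],
    'C': [0, 1, 2],
})

# Boolean Series: True where column A contains 'value', ignoring case.
cond = df['A'].str.contains('value', case=False)

# Series.mask replaces the values of B where the condition is True.
masked = df['B'].mask(cond, 'X')

# Boolean indexing with loc assigns in place, here on a copy of the frame.
df_loc = df.copy()
df_loc.loc[cond, 'B'] = 'X'

print(masked.equals(df_loc['B']))
```

mask returns a new Series and leaves the frame untouched until it is assigned back, while loc modifies the frame directly.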

EDIT:

It seems you also need to filter for values not equal to 0 in column C:

print((df['A'].str.contains('value', case=False)) & (df['C'] != 0))
0    False
1    False
2     True
dtype: bool

# starting again from the original data; mask is called on column B
df['B'] = df['B'].mask((df['A'].str.contains('value', case=False)) & (df['C'] != 0), 'X')
print(df)
       A     B  C
0  Value  Fred  0
1    foo   Jim  1
2  Value     X  2


# starting again from the original data
df.loc[(df['A'].str.contains('value', case=False)) & (df['C'] != 0), 'B'] = 'X'
print(df)
       A     B  C
0  Value  Fred  0
1    foo   Jim  1
2  Value     X  2
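
The parentheses around each comparison matter. In Python, & binds more tightly than !=, so an expression like cond & df['C'] != 0 is parsed as (cond & df['C']) != 0, which is not the intended filter. The sketch below first shows the grouping on plain scalars taken from the last row of the sample data, then applies the parenthesised condition to a frame rebuilt from the question's table. All variable names are only illustrative.

```python
import pandas as pd

# Scalar illustration using the last row: A matches, C is 2.
matches = True
c = 2
unparenthesised = matches & c != 0   # parsed as (matches & c) != 0
parenthesised = matches & (c != 0)   # the intended grouping
print(unparenthesised, parenthesised)

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    'A': ['Value', 'foo', 'Value'],
    'B': ['Fred', 'Jim', 'Bob'],
    'C': [0, 1, 2],
})

# Each comparison is evaluated on its own before combining with &.
both = df['A'].str.contains('value', case=False) & (df['C'] != 0)
df.loc[both, 'B'] = 'X'
print(df)
```

Writing every comparison in its own parentheses avoids having to remember the precedence rules at all.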