
[ astype does not work for Pandas dataframe ]

I have a dataframe called messages where data looks like

message             length  class
hello, Come here      16     A
hi, how are you       15     A 
what is it            10     B
maybe tomorrow        14     A

When I do

messages.dtypes

It shows me

class      object
message    object
Length      int64
dtype: object

Then I tried converting the message column to string type:

messages['message'] = messages['message'].astype(str)
print messages.dtypes

It still shows me

class      object
message    object
Length      int64
dtype: object

What am I doing wrong? Why doesn't it convert to string?

Python version 2.7.9 on Windows 10
Pandas version 0.15.2

Answer 1


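There is no separate "string" datatype in that pandas version. In pandas, strings are stored in columns of dtype object, so after astype(str) the column still reports object even though its values are strings.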
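To see this for yourself, here is a minimal sketch. The small frame is a hypothetical stand-in for your messages data, and the explicit astype(object) is my addition so the column uses the generic object dtype regardless of pandas version. It checks the type of the individual values rather than the column dtype:

```python
import pandas as pd

# Hypothetical stand-in for the `messages` frame from the question.
messages = pd.DataFrame({
    "message": ["hello, Come here", "hi, how are you"],
    "length": [16, 15],
})

# Assumption: force the generic object dtype, which is what the question's
# pandas version reports for text columns.
messages["message"] = messages["message"].astype(object)
print(messages["message"].dtype)

# The column dtype is object, but each cell holds an ordinary Python str.
message_types = set(type(v) for v in messages["message"])
print(message_types)

# astype(str) does convert values: the integers become strings here.
lengths_as_text = messages["length"].astype(str)
length_types = set(type(v) for v in lengths_as_text)
print(list(lengths_as_text))
```

So the conversion does happen; the dtype listing is just not where it shows up.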

NumPy does have string datatypes, but they are fixed-length, so there is still no single "string" datatype: there is one datatype for 5-character strings, another for 10-character strings, and so on. Pandas uses object as the datatype for strings so that you can perform size-changing manipulations on them (e.g., concatenating them with other strings) without having to recreate the entire column with a new string length.
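The fixed-length behaviour can be sketched as follows. The array names are hypothetical and the example assumes NumPy's default string handling. A longer string assigned into a fixed-width array is silently truncated, while an object array keeps it whole:

```python
import numpy as np

# NumPy sizes the string dtype to the longest element it is given.
fixed = np.array(["hello", "hi"])
print(fixed.dtype)

# Assigning a longer string does not widen the dtype; the value is cut
# down to the existing width of 5 characters.
fixed[1] = "maybe tomorrow"
print(fixed[1])

# An object array stores references to Python strings, so any length fits.
flexible = np.array(["hello", "hi"], dtype=object)
flexible[1] = "maybe tomorrow"
print(flexible[1])
```

This is the trade-off described above: object storage avoids rebuilding the whole column whenever a string grows.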