Reading thousands-formatted numbers as numbers

If a numeric column in a dataset uses a comma or another delimiter as a thousands separator, pandas reads that column as a string (object) by default. Because the values are really numbers, the column needs to be parsed as numeric before it can be used in further calculations:

data = pd.read_csv('tmp.txt', sep='|')

We get the following output:

Data with a level column with thousand format numbers
data.level.dtype returns dtype('O')

To overcome this problem, the thousands parameter can be used while reading:

data = pd.read_csv('tmp.txt', sep='|', thousands=',')
data.level.dtype now returns dtype('int64')
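
The contents of tmp.txt are not shown here, so the following is only a minimal sketch of the same behavior using a small in-memory sample. The sample rows and the name column are assumptions made for illustration; only the level column and the sep and thousands arguments come from the text above.

```python
import io

import pandas as pd

# Hypothetical pipe-delimited sample standing in for tmp.txt.
# Only the "level" column name comes from the text; the rest is illustrative.
raw = "name|level\nA|1,200\nB|35,000\nC|950\n"

# Without the thousands parameter, "1,200" cannot be parsed as a number,
# so the whole column is kept as text.
as_text = pd.read_csv(io.StringIO(raw), sep="|")

# With thousands=",", the commas are stripped before parsing,
# and the column is read as integers.
as_number = pd.read_csv(io.StringIO(raw), sep="|", thousands=",")

print(as_text["level"].dtype)
print(as_number["level"].dtype)
print(as_number["level"].sum())
```

Once the column is numeric, arithmetic such as sum() works on the values directly, which is not possible while they are stored as text.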