Customer profiling

One interesting task in unsupervised learning is customer profiling, that is, the clustering of customers. Given a dataset of customer information, the goal is to find groups of customers that share similar characteristics or buy the same products. This benefits business owners: knowing which groups of customers they have enables a more strategic approach to customer relationships.

Preprocessing data

Customer information can contain both numerical and categorical data. Whenever we face an unscaled categorical variable, we need to split it into as many binary variables as there are values the variable may take. For example, suppose that we have the following transaction list of customer purchases:

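| Transaction ID | Customer ID | Products               | Discount | Total |
|----------------|-------------|------------------------|----------|-------|
| 1399           | 56          | Milk, Bread, Butter    | 0.00     | 4.30  |
| 1400           | 991         | Cheese, Milk           | 2.30     | 5.60  |
| 1401           | 406         | Bread, Sausage         | 0.00     | 8.80  |
| 1402           | 239         | Chipotle Sauce, Spice  | 0.00     | 6.70  |
| 1403           | 33          | Turkey                 | 0.00     | 4.50  |
| 1404           | 406         | Turkey, Butter, Spice  | 1.00     | 9.00  |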

It is easy to see that Products is unscaled categorical data and that the number of products per transaction is not fixed: a customer may purchase one product or several. To transform this dataset into a numerical one, we need to apply preprocessing. One variable is added to the dataset for each product, and the transactions are combined per customer (customer 406 appears in both transaction 1401 and transaction 1404), resulting in the following:

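| Cust. ID | Milk | Bread | Butter | Cheese | Sausage | Chipotle Sauce | Spice | Turkey |
|----------|------|-------|--------|--------|---------|----------------|-------|--------|
| 56       | 1    | 1     | 1      | 0      | 0       | 0              | 0     | 0      |
| 991      | 1    | 0     | 0      | 1      | 0       | 0              | 0     | 0      |
| 406      | 0    | 1     | 1      | 0      | 1       | 0              | 1     | 1      |
| 239      | 0    | 0     | 0      | 0      | 0       | 1              | 1     | 0      |
| 33       | 0    | 0     | 0      | 0      | 0       | 0              | 0     | 1      |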

To save space, we left out the numerical variables (Discount and Total) and encoded the presence of a product among a customer's purchases as 1 and its absence as 0. An alternative preprocessing counts the number of occurrences of each value, so the variables are no longer binary but take discrete count values.
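The preprocessing described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the book's own implementation: the function name `encode_purchases` and its `binary` flag are assumptions introduced here, and the column order is simply taken from the first appearance of each product in the transaction list.

```python
from collections import Counter

# Transactions from the example: (transaction ID, customer ID, products).
transactions = [
    (1399, 56, ["Milk", "Bread", "Butter"]),
    (1400, 991, ["Cheese", "Milk"]),
    (1401, 406, ["Bread", "Sausage"]),
    (1402, 239, ["Chipotle Sauce", "Spice"]),
    (1403, 33, ["Turkey"]),
    (1404, 406, ["Turkey", "Butter", "Spice"]),
]


def encode_purchases(transactions, binary=True):
    """Hypothetical helper: build one numerical row per customer.

    Returns (products, rows), where products is the list of column names
    and rows maps each customer ID to a list with one entry per product.
    """
    products = []  # column names, in order of first appearance
    counts = {}    # customer ID -> Counter of purchased products
    for _, customer_id, items in transactions:
        for item in items:
            if item not in products:
                products.append(item)
        # Merge all transactions of the same customer into one Counter.
        counts.setdefault(customer_id, Counter()).update(items)

    rows = {}
    for customer_id, bought in counts.items():
        if binary:
            # Presence/absence encoding: 1 if ever purchased, else 0.
            rows[customer_id] = [1 if bought[p] > 0 else 0 for p in products]
        else:
            # Count encoding: number of occurrences of each product.
            rows[customer_id] = [bought[p] for p in products]
    return products, rows


products, binary_rows = encode_purchases(transactions)
_, count_rows = encode_purchases(transactions, binary=False)
```

With these six transactions the two encodings coincide, because no customer bought the same product more than once. They would differ as soon as a customer repeated a purchase: the count variant would then hold 2 where the binary variant still holds 1.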
