Dataset

We have the data in a comma-separated file. The first column is the URL and the second column identifies the label, stating whether the URL is good or bad. The dataset looks as follows:

url,label
diaryofagameaddict.com,bad
espdesign.com.au,bad
iamagameaddict.com,bad
kalantzis.net,bad
slightlyoffcenter.net,bad
toddscarwash.com,bad
tubemoviez.com,bad
ipl.hk,bad
crackspider.us/toolbar/install.php?pack=exe,bad
pos-kupang.com/,bad
rupor.info,bad
svision-online.de/mgfi/administrator/components/com_babackup/classes/fx29id1.txt,bad
officeon.ch.ma/office.js?google_ad_format=728x90_as,bad
sn-gzzx.com,bad
sunlux.net/company/about.html,bad
outporn.com,bad
timothycopus.aimoo.com,bad
xindalawyer.com,bad
freeserials.spb.ru/key/68703.htm,bad
deletespyware-adware.com,bad
orbowlada.strefa.pl/text396.htm,bad
ruiyangcn.com,bad
zkic.com,bad
adserving.favorit-network.com/eas?camp=19320;cre=mu&grpid=1738&tag_id=618&nums=FGApbjFAAA,bad
cracks.vg/d1.php,bad
juicypussyclips.com,bad
nuptialimages.com,bad
andysgame.com,bad
bezproudoff.cz,bad
ceskarepublika.net,bad
hotspot.cz,bad
gmcjjh.org/DHL,bad
nerez-schodiste-zabradli.com,bad
nordiccountry.cz,bad
nowina.info,bad
obada-konstruktiwa.org,bad
otylkaaotesanek.cz,bad
pb-webdesign.net,bad
pension-helene.cz,bad
podzemi.myotis.info,bad
smrcek.com,bad
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset