Computational Intelligence and Machine Learning Homework 1.docx
- Document ID: 11113386
- Uploaded: 2023-02-25
- Format: DOCX
- Pages: 15
- Size: 682.70 KB
Computational Intelligence and Machine Learning: Homework 1
Homework 1
First part:
Introduction to the Weka Explorer (10 marks)
1. Start the Weka Explorer.
2. Load the sample weather data.
3. Investigate the attributes of the weather sample data.
4. Run a J48 classifier.
5. Look at a more complex data set.
Answer these questions in the first part of your Homework 1 Report
a. From your observations in “5.b)” predict which pairs of attributes separate the three classes well. Also, predict which pairs are less effective. You can restrict your consideration to pairs (x, y) where neither x nor y is the class attribute and where x and y are not equal.
Answer:
The attribute pair (petallength, petalwidth), in either axis order, separates the three classes well. The pair (sepallength, sepalwidth), in either axis order, is less effective.
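As a rough illustration of the pairs the question allows, the sketch below enumerates every (x, y) with x not equal to y over the four non-class iris attributes. The attribute names follow iris.arff; the observation that swapping the axes of a scatter plot does not change class separation is an assumption of the sketch, not something the assignment states.

```python
from itertools import permutations

# The four non-class attributes of iris.arff.
attributes = ["sepallength", "sepalwidth", "petallength", "petalwidth"]

# Ordered pairs (x, y) with x != y; the class attribute is excluded.
ordered_pairs = list(permutations(attributes, 2))

# Assumption: (x, y) and (y, x) show the same separation, only mirrored,
# so it is enough to inspect each unordered pair once.
unordered_pairs = sorted({tuple(sorted(pair)) for pair in ordered_pairs})

print(len(ordered_pairs), "ordered pairs,", len(unordered_pairs), "unordered pairs")
for x, y in unordered_pairs:
    print(x, "vs", y)
```

This is why the answer above lists each pair in both orders: they are the same plot with the axes exchanged.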
b. Compare your predictions (above) with the J48 tree for this data.
Answer:
The comparison of my predictions with the J48 and Random Forest results is below:
1. Random Forest [figure]
2. J48 [figure]
c. Load weather.nominal.arff. What values can the attribute temperature have?
Answer:
Hot, mild and cool.
d. What is the class value of instance number 8 in the weather data?
Answer:
No.
e. In iris.arff, what is the range of possible values of the attribute petallength?
Answer:
The range of possible values of the attribute petallength is 1 to 6.9.
f. In the iris data, how many numeric and how many nominal attributes are there?
Answer:
There are four numeric attributes and one nominal attribute.
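The attribute counts can also be read off the ARFF header directly. The following is a minimal sketch, assuming the header of the standard iris.arff shipped with Weka (reproduced here from memory) and attribute names without quotes or spaces. The helper `count_attribute_types` is hypothetical and is not part of Weka.

```python
# Assumed header of the standard iris.arff; data rows are omitted.
ARFF_HEADER = """\
@RELATION iris
@ATTRIBUTE sepallength REAL
@ATTRIBUTE sepalwidth REAL
@ATTRIBUTE petallength REAL
@ATTRIBUTE petalwidth REAL
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
"""

def count_attribute_types(header):
    """Count numeric and nominal @attribute declarations in an ARFF header."""
    counts = {"numeric": 0, "nominal": 0, "other": 0}
    for line in header.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) < 3 or parts[0].lower() != "@attribute":
            continue  # skip @relation, blank lines, comments
        type_spec = parts[2].strip()
        if type_spec.startswith("{"):
            counts["nominal"] += 1  # a value list in braces marks a nominal attribute
        elif type_spec.lower() in ("numeric", "real", "integer"):
            counts["numeric"] += 1
        else:
            counts["other"] += 1    # e.g. string or date attributes
    return counts

counts = count_attribute_types(ARFF_HEADER)
print(counts)
```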
g. Load weather.nominal.arff. In the Preprocess tab, click on Choose. You will see a hierarchical filter selection. Choose the filter weka.filters.unsupervised.instance.RemoveWithValues. After clicking on that filter, on the main Preprocess panel click on “RemoveWithValues” to get a panel where you can set the attribute and values to remove. Remove all the instances for which the humidity attribute has the value high. (Hint: humidity is the third attribute (3) and the index of “high” can be seen before going into the filter selection by clicking on humidity in the Attribute section.) Click on “Apply” after you have made the filter selection. You can check that you have done it right by clicking on humidity again and looking in the Selected attribute panel. Now run J48 with Test option = “Use training set” and report the percent of “Correctly Classified Instances”.
Answer:
85.7143%.
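The filtering step can be mimicked outside Weka to sanity-check the figure. This sketch assumes the standard 14-instance weather.nominal data distributed with Weka (typed in from memory). `remove_with_value` is a hypothetical stand-in for RemoveWithValues that matches on the value name rather than on a value index. The last lines compute a majority-class baseline, which is not J48 itself.

```python
from collections import Counter

# Columns: outlook, temperature, humidity, windy, play (assumed standard data).
WEATHER = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]

def remove_with_value(rows, attribute_index, value):
    """Drop rows whose attribute at the 1-based attribute_index equals value."""
    return [row for row in rows if row[attribute_index - 1] != value]

# humidity is the third attribute; remove every instance where it is "high".
remaining = remove_with_value(WEATHER, 3, "high")

# Majority-class baseline on the filtered data (always predict the common class).
class_counts = Counter(row[-1] for row in remaining)
majority_class, majority_count = class_counts.most_common(1)[0]
baseline_accuracy = 100.0 * majority_count / len(remaining)

print(len(remaining), dict(class_counts), f"{baseline_accuracy:.4f}%")
```

Under these assumptions seven instances remain, six of them with class yes, so always predicting yes already gives 6/7, or 85.7143%. That matches the reported figure, although the source does not show which tree J48 actually built.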
Second Part:
Exploring the Weka Explorer (10 marks)
In this second part, you will practice using the Weka Explorer.
1. Start the Weka Explorer and load the labor negotiations data (see lab 1).
2. Run the J48 classifier several times, once for each of the following test options, and for each record the percentage of correctly classified instances. The purpose of this is to give you an initial understanding of the test output data. Cross-validation with n folds means that the training data is divided into n portions, and the algorithm is trained with n-1 parts as the training data, while one part is used to test the resulting model. This process is done in the n ways possible and the results are averaged. Theory and practice suggest that n = 10 folds is best in general. Testing by using the training set gives the most optimistic accuracy evaluation, since the same data is used both for training and testing. Percentage split 66% means that two thirds of the data are used for training, and one third is held back for testing. In your report for this question, show the resulting accuracy evaluations, and comment on the range of values.
a. Cross-validation folds 10
b. Cross-validation folds 5
c. Cross-validation folds 2
d. Percentage split 66%
e. Percentage split 33%
f. Use training set
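The test options listed above differ only in how instances are divided between training and testing. The sketch below shows that division for n-fold cross-validation and for a percentage split. It is a simplified illustration, not Weka's implementation: the folds are not stratified by class, the seed and rounding are arbitrary choices, and the figure of 57 instances for the labor data is an assumption.

```python
import random

def k_fold_indices(n_instances, n_folds, seed=1):
    """Yield (train, test) index lists, one pair per fold."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal portions.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        # Train on the other n_folds - 1 portions, test on the held-out one.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def percentage_split(n_instances, train_percent, seed=1):
    """Return (train, test) index lists for a single percentage split."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    n_train = round(n_instances * train_percent / 100)
    return indices[:n_train], indices[n_train:]

N_INSTANCES = 57  # assumed size of the labor negotiations data

for n_folds in (10, 5, 2):
    sizes = [len(test) for _, test in k_fold_indices(N_INSTANCES, n_folds)]
    print(n_folds, "folds, test fold sizes:", sizes)

for percent in (66, 33):
    train, test = percentage_split(N_INSTANCES, percent)
    print(f"{percent}% split: {len(train)} training, {len(test)} test instances")
```

With so few instances, a 33% split leaves fewer than twenty instances for training and a 2-fold run trains on about half the data, which is one plausible reason for the accuracy figures to vary across the options.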
3. Go back to the Preprocess tab, and under Filter, click on Choose. As an example of using a filter, choose the unsupervised attribute filter Remove. Click the Remove line that appears next to Choose to get an object editor for this filter. Click on More to read how to use it. Remove all even-numbered attributes, and run the J48 classifier again with cross-validation folds 10. Compare the tree and accuracy with the tree and accuracy you had when all attributes were considered.
After removing attributes: [figure]
Before removing attributes: [figure]
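The Remove filter takes a comma-separated list of 1-based attribute positions. A small sketch of building that list for the even-numbered attributes follows. It assumes the labor data has 17 attributes with the class in the last position; both helper functions are hypothetical illustrations, not Weka calls.

```python
def even_attribute_indices(n_attributes, class_index=None):
    """Return the 1-based even positions as a comma-separated string."""
    evens = [i for i in range(2, n_attributes + 1, 2) if i != class_index]
    return ",".join(str(i) for i in evens)

def remove_attributes(row, index_spec):
    """Drop the 1-based positions listed in index_spec from one data row."""
    drop = {int(i) for i in index_spec.split(",")}
    return [value for pos, value in enumerate(row, start=1) if pos not in drop]

# Assumption: 17 attributes in total, class attribute last (position 17).
index_spec = even_attribute_indices(17, class_index=17)
print(index_spec)

# Toy row of seven values to show the effect of dropping positions 2, 4, 6.
print(remove_attributes(list("abcdefg"), "2,4,6"))
```

Excluding the class position explicitly guards against removing the class by accident when the attribute count is even.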
4. Load the iris data using the Preprocess panel. Evaluate the C4.5 algorithm (J48) on this data using (a) the training set and (b) cross-validation. What is the estimated percentage of correct classifications for (a) and (b)? Which estimate is more realistic?
Load the iris data:
(a) the training set
The estimated percentage of correct classifications for (a) is 98%.
(b) cross-validation (Folds: 10)
The estimated percentage of correct classifications for (b) is 96%.
(b) is more realistic, because each test fold is held out from the data used to train the model.
5. Right click on the trees.J48 entry in the result list and choose Visualize classifier errors. The little crosses indicate instances correctly classified, and the squares represent the instances incorrectly classified. What can you say about the location of the errors?
Answer:
Load the iris data using the Preprocess panel. Evaluate the C4.5 algorithm (J48) on this data using 10-fold cross-validation.
Six instances are misclassified in this plot.
(1) Three instances are incorrectly predicted as Iris-virginica, but truly belong to class Iris-versicolor.
(2) Two instances are incorrectly predicted as Iris-versicolor, but truly belong to class Iris-virginica.
(3) One instance is incorrectly predicted as Iris-versicolor, but truly belongs to class Iris-setosa.
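The six errors listed above can be arranged into a confusion matrix to check that they agree with the 96% cross-validation accuracy. This is a minimal sketch, assuming the standard iris data with 50 instances per class; the matrix is rebuilt from the error counts in the answer, not copied from Weka output.

```python
CLASSES = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
PER_CLASS = 50  # assumed: the standard iris data has 50 instances per class

# (true class, predicted class) -> number of misclassified instances,
# taken from items (1) to (3) above.
errors = {
    ("Iris-versicolor", "Iris-virginica"): 3,
    ("Iris-virginica", "Iris-versicolor"): 2,
    ("Iris-setosa", "Iris-versicolor"): 1,
}

# Rows are true classes, columns are predicted classes.
matrix = [[0] * len(CLASSES) for _ in CLASSES]
for (true_cls, pred_cls), count in errors.items():
    matrix[CLASSES.index(true_cls)][CLASSES.index(pred_cls)] = count
for i, row in enumerate(matrix):
    row[i] = PER_CLASS - sum(row)  # the rest of each class is correct

correct = sum(matrix[i][i] for i in range(len(CLASSES)))
total = sum(sum(row) for row in matrix)
accuracy = 100.0 * correct / total

for name, row in zip(CLASSES, matrix):
    print(f"{name:16s}", row)
print(f"{correct}/{total} correct = {accuracy:.1f}%")
```

Five of the six errors fall between Iris-versicolor and Iris-virginica, which fits the earlier observation that these two classes overlap while Iris-setosa is mostly well separated.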