8Data Warehouse and OLAP.docx
- Document ID: 28195429
- Uploaded: 2023-07-09
- Format: DOCX
- Pages: 14
- Size: 25.54 KB
8 Data Warehouse and OLAP
Data warehouses generalize and consolidate data in multidimensional space. The construction of data warehouses involves data cleaning, data integration, and data transformation and can be viewed as an important preprocessing step for data mining.
Moreover, data warehouses provide on-line analytical processing (OLAP) tools for the interactive analysis of multidimensional data of varied granularities, which facilitates effective data generalization and data mining.
Many other data mining functions, such as association, classification, prediction, and clustering, can be integrated with OLAP operations to enhance interactive mining of knowledge at multiple levels of abstraction.
Hence, the data warehouse has become an increasingly important platform for data analysis and on-line analytical processing and will provide an effective platform for data mining. Therefore, data warehousing and OLAP form an essential step in the knowledge discovery process.
What Is a Data Warehouse?
Data warehousing provides architectures and tools for business executives to systematically organize, understand, and use their data to make strategic decisions. Data warehouse systems are valuable tools in today’s competitive, fast-evolving world.
In the last several years, many firms have spent millions of dollars in building enterprise-wide data warehouses. Many people feel that with competition mounting in every industry, data warehousing is the latest must-have marketing weapon: a way to retain customers by learning more about their needs.
“Then, what exactly is a data warehouse?” Data warehouses have been defined in many ways, making it difficult to formulate a rigorous definition.
Loosely speaking, a data warehouse refers to a database that is maintained separately from an organization’s operational databases.
Data warehouse systems allow for the integration of a variety of application systems. They support information processing by providing a solid platform of consolidated historical data for analysis.
According to William H. Inmon, a leading architect in the construction of data warehouse systems, “A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision making process.” This short but comprehensive definition presents the major features of a data warehouse.
The four keywords, subject-oriented, integrated, time-variant, and nonvolatile, distinguish data warehouses from other data repository systems, such as relational database systems, transaction processing systems, and file systems. Let’s take a closer look at each of these key features.
Subject-oriented:
A data warehouse is organized around major subjects, such as customer, supplier, product, and sales. Rather than concentrating on the day-to-day operations and transaction processing of an organization, a data warehouse focuses on the modeling and analysis of data for decision makers. Hence, data warehouses typically provide a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process.
Integrated:
A data warehouse is usually constructed by integrating multiple heterogeneous sources, such as relational databases, flat files, and on-line transaction records. Data cleaning and data integration techniques are applied to ensure consistency in naming conventions, encoding structures, attribute measures, and so on.
Time-variant:
Data are stored to provide information from a historical perspective (e.g., the past 5–10 years). Every key structure in the data warehouse contains, either implicitly or explicitly, an element of time.
Nonvolatile:
A data warehouse is always a physically separate store of data transformed from the application data found in the operational environment. Due to this separation, a data warehouse does not require transaction processing, recovery, and concurrency control mechanisms. It usually requires only two operations in data accessing: initial loading of data and access of data.
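The time-variant and nonvolatile features can be sketched as a table that supports only loading and reading. This is a minimal illustration, not a real warehouse engine; the class name `WarehouseTable`, its methods, and the sample rows are all hypothetical.

```python
from datetime import date


class WarehouseTable:
    """Hypothetical append-only table: rows can be loaded and read, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, rows, load_date):
        # Stamp every row with its load date so each record carries an explicit element of time.
        for row in rows:
            self._rows.append({**row, "load_date": load_date})

    def query(self, predicate=lambda row: True):
        # Return copies so that callers cannot modify the stored history.
        return [dict(row) for row in self._rows if predicate(row)]


sales = WarehouseTable()
sales.load([{"item": "TV", "amount": 400}], date(2023, 1, 31))
sales.load([{"item": "TV", "amount": 450}], date(2023, 2, 28))

# Both monthly snapshots remain available; the later load did not overwrite the earlier one.
history = sales.query(lambda row: row["item"] == "TV")
```

Because the class exposes no update or delete operation, it needs no locking or rollback logic, which mirrors the reasoning given for the nonvolatile feature.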
In sum, a data warehouse is a semantically consistent data store that serves as a physical implementation of a decision support data model and stores the information on which an enterprise needs to make strategic decisions. A data warehouse is also often viewed as an architecture, constructed by integrating data from multiple heterogeneous sources to support structured and/or ad hoc queries, analytical reporting, and decision making.
Based on this information, we view data warehousing as the process of constructing and using data warehouses. The construction of a data warehouse requires data cleaning, data integration, and data consolidation. The utilization of a data warehouse often necessitates a collection of decision support technologies.
This allows “knowledge workers” (e.g., managers, analysts, and executives) to use the warehouse to quickly and conveniently obtain an overview of the data, and to make sound decisions based on information in the warehouse. Some authors use the term “data warehousing” to refer only to the process of data warehouse construction, while the term “warehouse DBMS” is used to refer to the management and utilization of data warehouses.
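As a rough illustration of the data cleaning and integration that warehouse construction requires, the sketch below maps records from two sources onto one naming convention, one encoding, and one unit of measure. The source names, field names, and encodings are invented for the example.

```python
# Hypothetical records from two operational sources (all names and encodings are assumptions).
crm_rows = [
    {"cust_id": 1, "gender": "M", "revenue_usd": 120.0},
    {"cust_id": 2, "gender": "F", "revenue_usd": 80.5},
]
billing_rows = [
    {"CustomerNo": 3, "sex": 1, "revenue_cents": 4550},
    {"CustomerNo": 4, "sex": 0, "revenue_cents": 9900},
]


def from_crm(row):
    """CRM rows already use the target encoding and unit; only the key is renamed."""
    return {
        "customer_id": row["cust_id"],
        "gender": row["gender"],
        "revenue_usd": row["revenue_usd"],
    }


def from_billing(row):
    """Rename fields, decode the 0/1 gender flag, and convert cents to dollars."""
    return {
        "customer_id": row["CustomerNo"],
        "gender": "M" if row["sex"] == 1 else "F",
        "revenue_usd": row["revenue_cents"] / 100,
    }


# After integration every row has the same field names, encodings, and units.
warehouse_rows = [from_crm(r) for r in crm_rows] + [from_billing(r) for r in billing_rows]
```

Keeping one small mapping function per source makes each source's conventions explicit and leaves the integrated rows uniform for later analysis.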
“How are organizations using the information from data warehouses?” Many organizations use this information to support business decision-making activities, including
(1) increasing customer focus, which includes the analysis of customer buying patterns (such as buying preference, buying time, budget cycles, and appetites for spending);
(2) repositioning products and managing product portfolios by comparing the performance of sales by quarter, by year, and by geographic regions in order to fine-tune production strategies;
(3) analyzing operations and looking for sources of profit; and
(4) managing the customer relationships, making environmental corrections, and managing the cost of corporate assets.
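Comparing sales by quarter, by year, and by region, as in item (2), amounts to grouping the same facts along different attributes. The following is a small sketch with made-up figures; the table layout and the helper `total_by` are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, quarter, region, product, amount).
FIELDS = ("year", "quarter", "region", "product", "amount")
SALES = [
    (2022, "Q4", "North", "laptop", 900),
    (2023, "Q1", "North", "laptop", 1200),
    (2023, "Q1", "South", "laptop", 700),
    (2023, "Q1", "South", "tablet", 300),
    (2023, "Q2", "North", "tablet", 500),
]


def total_by(rows, *group_fields):
    """Sum the amount for each distinct combination of the given fields."""
    positions = [FIELDS.index(f) for f in group_fields]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]  # amount is the last field
    return dict(totals)


by_year = total_by(SALES, "year")
by_region = total_by(SALES, "region")
by_year_quarter = total_by(SALES, "year", "quarter")
```

The same function answers all three comparisons; only the grouping attributes change.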
Data warehousing is also very useful from the point of view of heterogeneous database integration. Many organizations typically collect diverse kinds of data and maintain large databases from multiple, heterogeneous, autonomous, and distributed information sources. To integrate such data, and provide easy and efficient access to it, is highly desirable, yet challenging. Much effort has been spent in the database industry and research community toward achieving this goal.
The traditional database approach to heterogeneous database integration is to build wrappers and integrators (or mediators), on top of multiple, heterogeneous databases. When a query is posed to a client site, a metadata dictionary is used to translate the query into queries appropriate for the individual heterogeneous sites involved. These queries are then mapped and sent to local query processors.
The results returned from the different sites are integrated into a global answer set. This query-driven approach requires complex information filtering and integration processes, and competes for resources with processing at local sources. It is inefficient and potentially expensive for frequent queries, especially for queries requiring aggregations.
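The query-driven approach can be sketched as a tiny mediator: a metadata dictionary translates global attribute names into each site's local names, each site is queried, and the results are merged into one answer set. The site names, attribute names, and data below are hypothetical, and in-memory lists stand in for real databases.

```python
# Hypothetical metadata dictionary: global attribute -> local attribute, per site.
METADATA = {
    "site_a": {"customer": "cust_name", "amount": "amt"},
    "site_b": {"customer": "client", "amount": "total"},
}

# In-memory stand-ins for two heterogeneous local databases.
SITES = {
    "site_a": [{"cust_name": "Ann", "amt": 10}],
    "site_b": [{"client": "Bob", "total": 20}],
}


def translate(global_attrs, site):
    """Rewrite a list of global attribute names into the names used at one site."""
    return [METADATA[site][attr] for attr in global_attrs]


def global_query(global_attrs):
    """Translate the query for every site, run it locally, and merge the results."""
    answer = []
    for site, rows in SITES.items():
        local_attrs = translate(global_attrs, site)
        for row in rows:  # stands in for the site's local query processor
            answer.append({g: row[l] for g, l in zip(global_attrs, local_attrs)})
    return answer


result = global_query(["customer", "amount"])
```

Every call to `global_query` visits every site again, which is why this style becomes costly for frequent or aggregate queries.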
Data warehousing provides an interesting alternative to the traditional approach of heterogeneous database integration described above. Rather than using a query-driven approach, data warehousing employs an update-driven approach in which information from multiple, heterogeneous sources is integrated in advance and stored in a warehouse for direct querying and analysis. Unlike on-line transaction processing databases, data warehouses do not contain the most current information.
However, a data warehouse brings high performance to the integrated heterogeneous database system because data are copied, preprocessed, integrated, annotated, summarized, and restructured into one semantic data store. Furthermore, query processing in data warehouses does not interfere with the processing at local sources. Moreover, data warehouses can store and integrate historical information and support complex multidimensional queries. As a result, data warehousing has become popular in industry.
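The trade-off between the two approaches can be made concrete by counting how often the sources are read. In this sketch, the `Source` class, its `scans` counter, and the figures are all assumptions chosen to keep the example small.

```python
class Source:
    """Hypothetical stand-in for one autonomous operational database."""

    def __init__(self, rows):
        self.rows = rows
        self.scans = 0  # number of times this source has been read

    def scan(self):
        self.scans += 1
        return list(self.rows)


def query_driven_total(sources):
    """Query-driven: every query goes back to every source."""
    return sum(amount for src in sources for _, amount in src.scan())


def build_warehouse(sources):
    """Update-driven: read each source once, in advance, and store a summary."""
    totals = {}
    for src in sources:
        for region, amount in src.scan():
            totals[region] = totals.get(region, 0) + amount
    return totals


a = Source([("North", 100), ("South", 50)])
b = Source([("North", 30)])

# Three queries under the query-driven approach touch both sources each time.
for _ in range(3):
    query_driven_total([a, b])
scans_query_driven = a.scans + b.scans

# Under the update-driven approach the sources are read once; queries hit the warehouse only.
a.scans = b.scans = 0
warehouse = build_warehouse([a, b])
answers = [sum(warehouse.values()) for _ in range(3)]
scans_update_driven = a.scans + b.scans
```

The warehouse answers repeated queries without touching the sources, at the price of reflecting only the data present when it was last loaded.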
From Data Warehousing to Data Mining
Data warehouses and data marts are used in a wide range of applications. Business executives use the data in data warehouses and data marts to perform data analysis and make strategic decisions.
In many firms, data warehouses are used as an integral part of a plan-execute-assess “closed-loop” feedback system for enterprise management. Data warehouses are used extensively in banking and financial services, consumer goods and retail distribution sectors, and controlled manufacturing, such as demand-based production.
Typically, the longer a data warehouse has been in use, the more it will have evolved. This evolution takes place throughout a number of phases. Initially, the data warehouse is mainly used for generating reports and answering predefined queries. Progressively, it is used to analyze summarized and detailed data, where the results are presented in the form of reports and charts.
Later, the data warehouse is used for strategic purposes, performing multidimensional analysis and sophisticated slice-and-dice operations. Finally, the data warehouse may be employed for knowledge discovery and strategic decision making using data mining tools. In this context, the tools for data warehousing can be categorized into access and retrieval tools, database reporting tools, data analysis tools, and data mining tools.
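The slice-and-dice operations mentioned above can be sketched on a toy three-dimensional cube stored as a dictionary. The dimension names, cell values, and helper functions are hypothetical; real OLAP servers use far more elaborate storage.

```python
# Hypothetical cube: (quarter, city, item) -> units sold.
DIMS = ("quarter", "city", "item")
cube = {
    ("Q1", "Vancouver", "TV"): 10,
    ("Q1", "Vancouver", "phone"): 4,
    ("Q1", "Toronto", "TV"): 7,
    ("Q2", "Vancouver", "TV"): 12,
    ("Q2", "Toronto", "phone"): 5,
}


def slice_cube(cube, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return {key: v for key, v in cube.items() if key[i] == value}


def dice_cube(cube, allowed):
    """Dice: keep cells whose coordinates lie in the allowed sets on several dimensions."""
    return {
        key: v
        for key, v in cube.items()
        if all(key[DIMS.index(d)] in values for d, values in allowed.items())
    }


def roll_up(cube, keep):
    """Roll up: aggregate away every dimension not listed in keep."""
    positions = [DIMS.index(d) for d in keep]
    totals = {}
    for key, v in cube.items():
        reduced = tuple(key[p] for p in positions)
        totals[reduced] = totals.get(reduced, 0) + v
    return totals


q1 = slice_cube(cube, "quarter", "Q1")
vancouver_tv = dice_cube(cube, {"city": {"Vancouver"}, "item": {"TV"}})
by_city = roll_up(cube, ["city"])
```

Slicing and dicing select a sub-cube without changing cell values, while rolling up moves to a coarser granularity by summing cells.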
Business users need to have the means to know what exists in the data warehouse (through metadata), how to access the contents of the data warehouse, how to examine the contents using analysis tools, and how to present the results of such analysis. There are three kinds of data warehouse applications: information processing, analytical processing, and data mining.