30 Mar 2017

DATA STREAM CLUSTERING ISSUES AND CHALLENGES-A SURVEY

  • Assistant Professor, Department of CSE, GRIET, Hyderabad.

In recent years, advances in both hardware and software technology have made it possible to automatically record transactions and other information every day at a rapid rate. Huge volumes of web, sensor, and transactional data are continuously generated as data streams, which need to be analyzed online as they arrive. The analysis of data streams has been researched extensively because of its emerging, imminent, and broad applications. Clustering, one of the most important analysis methods, has been widely studied in the data mining community. Many existing data mining methods cannot be applied directly to streaming data because the data must be mined in a single pass. Furthermore, temporal locality is important in data stream processing: the essential patterns in the data may change over time, so clusters found in past data may no longer be relevant to future data. In this paper we explore various issues and challenges in clustering data streams.
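The two constraints named in the abstract, single-pass processing and temporal locality, can be illustrated with a small sketch. This is not an algorithm taken from the paper: it is a plain online k-means variant with an exponential fading factor, and the class name `DecayedOnlineKMeans` and the parameters `k` and `decay` are hypothetical names chosen for illustration. Each point is seen once and then discarded, and older points gradually lose influence on the cluster centers.

```python
import math
from typing import List, Sequence


class DecayedOnlineKMeans:
    """Illustrative single-pass clustering with a fading factor (hypothetical names)."""

    def __init__(self, k: int, decay: float = 0.99) -> None:
        self.k = k              # number of clusters to maintain
        self.decay = decay      # 1.0 = no forgetting; smaller = forget faster
        self.centers: List[List[float]] = []
        self.weights: List[float] = []   # faded count of points per cluster

    def update(self, x: Sequence[float]) -> int:
        """Consume one stream element and return the index of its cluster."""
        point = [float(v) for v in x]

        # Assumption: the first k points simply seed the k centers.
        if len(self.centers) < self.k:
            self.centers.append(point)
            self.weights.append(1.0)
            return len(self.centers) - 1

        # Temporal locality: fade the weight of everything seen so far.
        self.weights = [w * self.decay for w in self.weights]

        # Assign the point to the nearest center (Euclidean distance).
        j = min(range(self.k), key=lambda i: math.dist(point, self.centers[i]))

        # Single pass: move the center toward the point, then drop the point.
        self.weights[j] += 1.0
        step = 1.0 / self.weights[j]
        self.centers[j] = [c + step * (p - c) for c, p in zip(self.centers[j], point)]
        return j


def cluster_stream(stream, k: int, decay: float) -> DecayedOnlineKMeans:
    """Feed every element of an iterable to the model exactly once."""
    model = DecayedOnlineKMeans(k, decay)
    for x in stream:
        model.update(x)
    return model
```

With `decay=1.0` each center is the ordinary running mean of its points; with a smaller value, recent points pull the centers harder, which is one simple way to keep clusters relevant when the underlying patterns change.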



[B. Rupa and R. Soujanya. (2017); DATA STREAM CLUSTERING ISSUES AND CHALLENGES-A SURVEY Int. J. of Adv. Res. 5 (Mar). 1644-1649] (ISSN 2320-5407). www.journalijar.com


B.RUPA
ASSISTANT PROFESSOR, DEPARTMENT OF CSE, GRIET, HYDERABAD

Article DOI: 10.21474/IJAR01/3673      
DOI URL: https://dx.doi.org/10.21474/IJAR01/3673