A comprehensive survey of data mining

  • Original Research
  • Published: 06 February 2020
  • Volume 12 , pages 1243–1257, ( 2020 )


  • Manoj Kumar Gupta (ORCID: orcid.org/0000-0002-4481-8432)
  • Pravin Chandra

4984 Accesses

59 Citations


Data mining plays an important role in many human activities because it extracts previously unknown, useful patterns (or knowledge). Owing to these capabilities, data mining has become an essential task in a large number of application domains, such as banking, retail, medicine, insurance, and bioinformatics. To take a holistic view of research trends in data mining, this paper presents a systematic and comprehensive survey of data mining tasks and techniques. It also presents various real-life applications of data mining, along with the challenges and issues in data mining research.


Fayadd U, Piatesky-Shapiro G, Smyth P (1996) From data mining to knowledge discovery in databases. AAAI Press/The MIT Press, Massachusetts Institute of Technology. ISBN 0–262 56097–6 Fayap

Fayadd U, Piatesky-Shapiro G, Smyth P (1996) Knowledge discovery and data mining: towards a unifying framework. In: Proceedings of the 2nd ACM international conference on knowledge discovery and data mining (KDD), Portland, pp 82–88

Heikki M (1996) Data mining: machine learning, statistics, and databases. In: SSDBM ’96: proceedings of the eighth international conference on scientific and statistical database management, June 1996, pp 2–9

Arora RK, Gupta MK (2017) e-Governance using data warehousing and data mining. Int J Comput Appl 169(8):28–31

Google Scholar  

Morik K, Bhaduri K, Kargupta H (2011) Introduction to data mining for sustainability. Data Min Knowl Discov 24(2):311–324

Han J, Kamber M, Pei J (2012) Data mining concepts and techniques, 3rd edn. Elsevier, Netherlands

MATH   Google Scholar  

Friedman JH (1997) Data mining and statistics: What is the connection? in: Keynote Speech of the 29th Symposium on the Interface: Computing Science and Statistics, Houston, TX, 1997

Turban E, Aronson JE, Liang TP, Sharda R (2007) Decision support and business intelligence systems. 8 th edn, Pearson Education, UK

Gheware SD, Kejkar AS, Tondare SM (2014) Data mining: tasks, tools, techniques and applications. Int J Adv Res Comput Commun Eng 3(10):8095–8098

Kiranmai B, Damodaram A (2014) A review on evaluation measures for data mining tasks. Int J Eng Comput Sci 3(7):7217–7220

Sharma M (2014) Data mining: a literature survey. Int J Emerg Res Manag Technol 3(2):1–4

Venkatadri M, Reddy LC (2011) A review on data mining from past to the future. Int J Comput Appl 15(7):19–22

Chen M, Han J, Yu PS (1996) Data mining: an overview from a database perspective. IEEE Trans Knowl Data Eng 8(6):866–883

Gupta MK, Chandra P (2019) A comparative study of clustering algorithms. In: Proceedings of the 13th INDIACom-2019; IEEE Conference ID: 461816; 6th International Conference on “Computing for Sustainable Global Development”

Ponniah P (2001) Data warehousing fundamentals. Wiley, USA

Chandra P, Gupta MK (2018) Comprehensive survey on data warehousing research. Int J Inform Technol 10(2):217–224

Weiss SH, Indurkhya N (1998) Predictive data mining: a practical guide. Morgan Kaufmann Publishers, San Francisco

Fu Y (1997) Data mining: tasks, techniques, and applications. IEEE Potentials 16(4):18–20

Abuaiadah D (2015) Using bisect k-means clustering technique in the analysis of arabic documents. ACM Trans Asian Low-Resour Lang Inf Process 15(3):1–17

Algergawy A, Mesiti M, Nayak R, Saake G (2011) XML data clustering: an overview. ACM Comput Surv 43(4):1–25

Angiulli F, Fassetti F (2013) Exploiting domain knowledge to detect outliers. Data Min Knowl Discov 28(2):519–568

MathSciNet   MATH   Google Scholar  

Angiulli F, Fassetti F (2016) Toward generalizing the unification with statistical outliers: the gradient outlier factor measure. ACM Trans Knowl Discov Data 10(3):1–26

Bhatnagar V, Ahuja S, Kaur S (2015) Discriminant analysis-based cluster ensemble. Int J Data Min Modell Manag 7(2):83–107

Bouguessa M (2013) Clustering categorical data in projected spaces. Data Min Knowl Discov 29(1):3–38

MathSciNet   Google Scholar  

Campello RJGB, Moulavi D, Zimek A, Sander J (2015) Hierarchical density estimates for data clustering, visualization, and outlier detection. ACM Trans Knowl Discov Data 10(1):1–51

Carpineto C, Osinski S, Romano G, Weiss D (2009) A survey of web clustering engines. ACM Comput. Surv. 41(3):1–38

Ceglar A, Roddick JF (2006) Association mining. ACM Comput Surv 38(2):1–42

Chen YL, Weng CH (2009) Mining fuzzy association rules from questionnaire data. Knowl Based Syst 22(1):46–56

Fan Chin-Yuan, Fan Pei-Shu, Chan Te-Yi, Chang Shu-Hao (2012) Using hybrid data mining and machine learning clustering analysis to predict the turnover rate for technology professionals. Expert Syst Appl 39:8844–8851

Das R, Kalita J, Bhattacharya (2011) A pattern matching approach for clustering gene expression data. Int J Data Min Model Manag 3(2):130–149

Dincer E (2006) The k-means algorithm in data mining and an application in medicine. Kocaeli Univesity, Kocaeli

Geng L, Hamilton HJ (2006) Interestingness measures for data mining: a survey. ACM Comput Surv 38(3):1–32

Gupta MK, Chandra P (2019) P-k-means: k-means using partition based cluster initialization method. In: Proceedings of the international conference on advancements in computing and management (ICACM 2019), Elsevier SSRN, pp 567–573

Gupta MK, Chandra P (2019) An empirical evaluation of k-means clustering algorithm using different distance/similarity metrics. In: Proceedings of the international conference on emerging trends in information technology (ICETIT-2019), emerging trends in information technology, LNEE 605 pp 884–892 DOI: https://doi.org/10.1007/978-3-030-30577-2_79

Hea Z, Xua X, Huangb JZ, Denga S (2004) Mining class outliers: concepts, algorithms and applications in CRM. Expert Syst Appl 27(4):681e97

Hung LN, Thu TNT, Nguyen GC (2015) An efficient algorithm in mining frequent itemsets with weights over data stream using tree data structure. IJ Intell Syst Appl 12:23–31

Hung LN, Thu TNT (2016) Mining frequent itemsets with weights over data stream using inverted matrix. IJ Inf Technol Comput Sci 10:63–71

Jain AK, Murty MN, Flynn PJ (1999) Data clustering: a review. ACM Comput. Surv 31(3):1–60

Jin H, Wang S, Zhou Q, Li Y (2014) An improved method for density-based clustering. Int J Data Min Model Manag 6(4):347–368

Khandare A, Alvi AS (2017) Performance analysis of improved clustering algorithm on real and synthetic data. IJ Comput Netw Inf Secur 10:57–65

Koh YS, Ravana SD (2016) Unsupervised rare pattern mining: a survey. ACM Trans Knowl Discov Data 10(4):1–29

Kosina P, Gama J (2015) Very fast decision rules for classification in data streams. Data Min Knowl Discov 29(1):168–202

Kotsiantis SB (2007) Supervised machine learning: a review of classification techniques. Informatica 31:249–268

Kumar D, Bezdek JC, Rajasegarar S, Palaniswami M, Leckie C, Chan J, Gubbi J (2016) Adaptive cluster tendency visualization and anomaly detection for streaming data. ACM Trans Knowl Discov Data 11(2):1–24

Lee G, Yun U (2017) A new efficient approach for mining uncertain frequent patterns using minimum data structure without false positives. Future Gener Comput Syst 68:89–110

Li G, Zaki MJ (2015) Sampling frequent and minimal boolean patterns: theory and application in classification. Data Min Knowl Discov 30(1):181–225. https://doi.org/10.1007/s10618-015-0409-y

Article   MathSciNet   MATH   Google Scholar  

Liao TW, Triantaphyllou E (2007) Recent advances in data mining of enterprise data: algorithms and applications. World Scientific Publishing, Singapore, pp 111–145

Mabroukeh NR, Ezeife CI (2010) A taxonomy of sequential pattern mining algorithms. ACM Comput Surv 43:1

Mampaey M, Vreeken J (2011) Summarizing categorical data by clustering attributes. Data Min Knowl Discov 26(1):130–173

Menardi G, Torelli N (2012) Training and assessing classification rules with imbalanced data. Data Min Knowl Discov 28(1):4–28. https://doi.org/10.1007/s10618-012-0295-5

Mukhopadhyay A, Maulik U, Bandyopadhyay S (2015) A survey of multiobjective evolutionary clustering. ACM Comput Surv 47(4):1–46

Pei Y, Fern XZ, Tjahja TV, Rosales R (2016) ‘Comparing clustering with pairwise and relative constraints: a unified framework. ACM Trans Knowl Discov Data 11:2

Rafalak M, Deja M, Wierzbicki A, Nielek R, Kakol M (2016) Web content classification using distributions of subjective quality evaluations. ACM Trans Web 10:4

Reddy D, Jana PK (2014) A new clustering algorithm based on Voronoi diagram. Int J Data Min Model Manag 6(1):49–64

Rustogi S, Sharma M, Morwal S (2017) Improved Parallel Apriori Algorithm for Multi-cores. IJ Inf Technol Comput Sci 4:18–23

Shah-Hosseini H (2013) Improving K-means clustering algorithm with the intelligent water drops (IWD) algorithm. Int J Data Min Model Manag 5(4):301–317

Silva JA, Faria ER, Barros RC, Hruschka ER, de Carvalho ACPLF, Gama J (2013) Data stream clustering: a survey. ACM Comput Surv 46(1):1–31

Silva A, Antunes C (2014) Multi-relational pattern mining over data streams. Data Min Knowl Discov 29(6):1783–1814. https://doi.org/10.1007/s10618-014-0394-6

Sim K, Gopalkrishnan V, Zimek A, Cong G (2012) A survey on enhanced subspace clustering. Data Min Knowl Discov 26(2):332–397

Sohrabi MK, Roshani R (2017) Frequent itemset mining using cellular learning automata. Comput Hum Behav 68:244–253

Craw Susan, Wiratunga Nirmalie, Rowe Ray C (2006) Learning adaptation knowledge to improve case-based reasoning. Artif Intell 170:1175–1192

Tan KC, Teoh EJ, Yua Q, Goh KC (2009) A hybrid evolutionary algorithm for attribute selection in data mining. Expert Syst Appl 36(4):8616–8630

Tew C, Giraud-Carrier C, Tanner K, Burton S (2013) Behavior-based clustering and analysis of interestingness measures for association rule mining. Data Min Knowl Discov 28(4):1004–1045

Wang L, Dong M (2015) Exemplar-based low-rank matrix decomposition for data clustering. Data Min Knowl Discov 29:324–357

Wang F, Sun J (2014) Survey on distance metric learning and dimensionality reduction in data mining. Data Min Knowl Discov 29:534–564

Wang B, Rahal I, Dong A (2011) Parallel hierarchical clustering using weighted confidence affinity. Int J Data Min Model Manag 3(2):110–129

Zacharis NZ (2018) Classification and regression trees (CART) for predictive modeling in blended learning. IJ Intell Syst Appl 3:1–9

Zhang W, Li R, Feng D, Chernikov A, Chrisochoides N, Osgood C, Ji S (2015) Evolutionary soft co-clustering: formulations, algorithms, and applications. Data Min Knowl Discov 29:765–791

Han J, Fu Y (1996) Exploration of the power of attribute-oriented induction in data mining. Adv Knowl Discov Data Min. AAAI/MIT Press, pp 399-421

Gupta A, Mumick IS (1995) Maintenance of materialized views: problems, techniques, and applications. IEEE Data Eng Bull 18(2):3

Sawant V, Shah K (2013) A review of distributed data mining using agents. Int J Adv Technol Eng Res 3(5):27–33

Gupta MK, Chandra P (2019) An efficient approach for selection of initial cluster centroids for k-means clustering algorithm. In: Proceedings international conference on recent developments in science engineering and technology (REDSET-2019), November 15–16 2019

Gupta MK, Chandra P (2019) MP-K-means: modified partition based cluster initialization method for k-means algorithm. Int J Recent Technol Eng 8(4):1140–1148

Gupta MK, Chandra P (2019) HYBCIM: hypercube based cluster initialization method for k-means. IJ Innov Technol Explor Eng 8(10):3584–3587. https://doi.org/10.35940/ijitee.j9774.0881019

Article   Google Scholar  

Enke David, Thawornwong Suraphan (2005) The use of data mining and neural networks for forecasting stock market returns. Expert Syst Appl 29:927–940

Mezyk Edward, Unold Olgierd (2011) Machine learning approach to model sport training. Comput Hum Behav 27:1499–1506

Esling P, Agon C (2012) Time-series data mining. ACM Comput Surv 45(1):1–34

Hüllermeier Eyke (2005) Fuzzy methods in machine learning and data mining: status and prospects. Fuzzy Sets Syst 156:387–406

Hullermeier Eyke (2011) Fuzzy sets in machine learning and data mining. Appl Soft Comput 11:1493–1505

Gengshen Du, Ruhe Guenther (2014) Two machine-learning techniques for mining solutions of the ReleasePlanner™ decision support system. Inf Sci 259:474–489

Smith Kate A, Gupta Jatinder ND (2000) Neural networks in business: techniques and applications for the operations researcher. Comput Oper Res 27:1023–1044

Huang Mu-Jung, Tsou Yee-Lin, Lee Show-Chin (2006) Integrating fuzzy data mining and fuzzy artificial neural networks for discovering implicit knowledge. Knowl Based Syst 19:396–403

Padhraic S (2000) Data mining: analysis on grand scale. Stat Method Med Res 9(4):309–327. https://doi.org/10.1191/096228000701555181

Article   MATH   Google Scholar  

Saeed S, Ali M (2012) Privacy-preserving back-propagation and extreme learning machine algorithms. Data Knowl Eng 79–80:40–61

Singh Y, Bhatia PK, Sangwan OP (2007) A review of studies on machine learning techniques. Int J Comput Sci Secur 1(1):70–84

Yahia ME, El-taher ME (2010) A new approach for evaluation of data mining techniques. Int J Comput Sci Issues 7(5):181–186

Jackson J (2002) Data mining: a conceptual overview. Commun Assoc Inf Syst 8:267–296

Heckerman D (1998) A tutorial on learning with Bayesian networks. Learning in graphical models. Springer, Netherlands, pp 301–354

Politano PM, Walton RO (2017) Statistics & research methodol. Lulu. com

Wetherill GB (1987) Regression analysis with application. Chapman & Hall Ltd, UK

Anderberg MR (2014) Cluster analysis for applications: probability and mathematical statistics: a series of monographs and textbooks, vol 19. Academic Press, USA

Mihoci A (2017) Modelling limit order book volume covariance structures. In: Hokimoto T (ed) Advances in statistical methodologies and their application to real problems. IntechOpen, Croatia. https://doi.org/10.5772/66152

Chapter   Google Scholar  

Thompson B (2004) Exploratory and confirmatory factor analysis: understanding concepts and applications. American Psychological Association, Washington, DC (ISBN:1-59147-093-5)

Kuzey C, Uyar A, Delen (2014) The impact of multinationality on firm value: a comparative analysis of machine learning techniques. Decis Support Syst 59:127–142

Chan Philip K, Salvatore JS (1997) On the accuracy of meta-learning for scalable data mining. J Intell Inf Syst 8:5–28

Tsai Chih-Fong, Hsu Yu-Feng, Lin Chia-Ying, Lin Wei-Yang (2009) Intrusion detection by machine learning: a review. Expert Syst Appl 36:11994–12000

Liao SH, Chu PH, Hsiao PY (2012) Data mining techniques and applications—a decade review from 2000 to 2011. Expert Syst Appl 39:11303–11311

Kanevski M, Parkin R, Pozdnukhov A, Timonin V, Maignan M, Demyanov V, Canu S (2004) Environmental data mining and modelling based on machine learning algorithms and geostatistics. Environ Model Softw 19:845–855

Jain N, Srivastava V (2013) Data mining techniques: a survey paper. Int J Res Eng Technol 2(11):116–119

Baker RSJ (2010) Data mining for education. In: McGaw B, Peterson P, Baker E (eds) International encyclopedia of education, 3rd edn. Elsevier, Oxford, UK

Lew A, Mauch H (2006) Introduction to data mining and its applications. Springer, Berlin

Mukherjee S, Shaw R, Haldar N, Changdar S (2015) A survey of data mining applications and techniques. Int J Comput Sci Inf Technol 6(5):4663–4666

Data mining examples: most common applications of data mining (2019). https://www.softwaretestinghelp.com/data-mining-examples/ . Accessed 27 Dec 2019

Devi SVSG (2013) Applications and trends in data mining. Orient J Comput Sci Technol 6(4):413–419

Data mining—applications & trends. https://www.tutorialspoint.com/data_mining/dm_applications_trends.htm

Keleş MK (2017) An overview: the impact of data mining applications on various sectors. Tech J 11(3):128–132

Top 14 useful applications for data mining. https://bigdata-madesimple.com/14-useful-applications-of-data-mining/ . Accessed 20 Aug 2014

Yang Q, Wu X (2006) 10 challenging problems in data mining research. Int J Inf Technol Decis Making 5(4):597–604

Padhy N, Mishra P, Panigrahi R (2012) A survey of data mining applications and future scope. Int J Comput Sci Eng Inf Technol 2(3):43–58

Gibert K, Sanchez-Marre M, Codina V (2010) Choosing the right data mining technique: classification of methods and intelligent recommendation. In: International Congress on Environment Modelling and Software Modelling for Environment’s Sake, Fifth Biennial Meeting, Ottawa, Canada

Download references

Author information

Authors and affiliations.

University School of Information, Communication and Technology, Guru Gobind Singh Indraprastha University, Sector-16C, Dwarka, Delhi, 110078, India

Manoj Kumar Gupta & Pravin Chandra


Corresponding author

Correspondence to Manoj Kumar Gupta.


About this article

Gupta, M.K., Chandra, P. A comprehensive survey of data mining. Int. j. inf. tecnol. 12, 1243–1257 (2020). https://doi.org/10.1007/s41870-020-00427-7


Received: 29 June 2019

Accepted: 20 January 2020

Published: 06 February 2020

Issue Date: December 2020

DOI: https://doi.org/10.1007/s41870-020-00427-7


  • Data mining techniques
  • Data mining tasks
  • Data mining applications
  • Classification

Web Mining: A Survey of Current Research, Techniques, and Software

Richard Segall

2008, International Journal of Information Technology & Decision Making

The purpose of this paper is to provide a more current evaluation and update of web mining research and techniques available. Current advances in each of the three different types of web mining are reviewed in the categories of web content mining, web usage mining, and web structure mining. For each tabulated research work, we examine such key issues as web mining process, methods/techniques, applications, data sources, and software used. Unlike previous investigators, we divide web mining processes into the following five subtasks: (1) resource finding and retrieving, (2) information selection and preprocessing, (3) patterns analysis and recognition, (4) validation and interpretation, and (5) visualization. This paper also reports the comparisons and summaries of selected software for web mining. The web mining software selected for discussion and comparison in this paper are SPSS Clementine, Megaputer PolyAnalyst, ClickTracks by web analytics, and QL2 by QL2 Software Inc. Applicat...

Related Papers

anurag kumar

Web mining is moving the World Wide Web towards a more useful environment in which users can quickly and easily find the information they need. A large and still growing number of text documents, multimedia files, and images is available on the web. Data mining is the extraction of knowledge from data, and web mining is the part of data mining that discovers and extracts information from Web-related data sources such as Web documents, Web content, hyperlinks, and server logs. The term Web mining has been used in three distinct ways. The first, called Web content mining, is the process of information discovery from sources across the World Wide Web. The second, called Web structure mining, is the process of analyzing the relationships between Web pages linked by information or direct link connections, using graph theory. The third, called Web usage mining, is the process of extracting patterns and information from server logs to gain insight into user activity. In this paper, we give a brief overview of web mining, covering its techniques, tools, and applications.



Bonfring International Journal

The Web is a platform for information exchange, as it is simple and easy to publish documents on it. As the web grows, searching for information becomes a difficult and time-consuming process. Web mining uses various data mining techniques to discover useful knowledge from web usage log files. Mining tools scan HTML documents, images, and text, and the results are provided to search engines. This can help search engines return productive results for each search, ordered by relevance. In this paper, we give a brief introduction to the concepts related to web mining and then an overview of Web usage mining.

Dr. M.A.Dorairangaswamy

This study presents the role of Web mining amid the explosive growth of the World Wide Web, as websites provide information and knowledge to end users. This review paper gives an in-depth study of the various technologies available for web mining, which is the application of data mining techniques to extract knowledge from the web. Current advances in each of the three different types of web mining are reviewed in the categories of web content mining, web usage mining, and web structure mining. Index Terms: web mining, web content mining, web usage mining, web structure mining.


Research Publish Journals

Abstract: Web mining is a very active research topic that combines two active research areas: data mining and the World Wide Web. Web mining research relates to several research communities, such as databases, information retrieval, and artificial intelligence. Although there is some confusion about what Web mining covers, the most widely recognized approach is to categorize it into three areas: Web content mining, Web structure mining, and Web usage mining. Web content mining focuses on the discovery and retrieval of useful information from Web contents, data, and documents, while Web structure mining emphasizes discovering how to model the underlying link structures of the Web. The distinction between these two categories is not always clear. Web usage mining is a relatively independent, but not isolated, category, which mainly describes techniques that discover users' usage patterns and try to predict their behavior. This paper is a survey based on recently published research papers. Besides providing an overall view of Web mining, it focuses on Web usage mining. Generally speaking, Web usage mining consists of three phases: pre-processing, pattern discovery, and pattern analysis. A detailed description is given of each phase, with special attention paid to the discovery and analysis of user navigation patterns. User privacy is another important issue addressed in this paper. An example of a prototypical Web usage mining system, WebSIFT, is introduced to make it easier to understand how data mining techniques are applied to large Web data repositories in order to extract usage patterns. Finally, along with some other interesting research issues, a brief overview of current research work in the area of Web usage mining is included.
Title: WEB MINING AN APPLICATION OF DATA MINING. Authors: Sumit Dalal, Sumit Kumar, Vivek Dixit. International Journal of Computer Science and Information Technology Research, ISSN 2348-120X (online), ISSN 2348-1196 (print). Research Publish Journals.
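
The three phases of Web usage mining named in this abstract (pre-processing, pattern discovery, pattern analysis) can be illustrated with a small sketch. This is not the method of the paper or of WebSIFT; the sample log, its Common-Log-Format-like layout, the per-IP grouping, and the function names `preprocess`, `discover_patterns`, and `analyze_patterns` are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample in a Common-Log-Format-like layout (an assumption;
# real server logs vary and need far more careful cleaning).
SAMPLE_LOG = """\
10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET /home HTTP/1.1" 200 512
10.0.0.1 - - [01/Jan/2024:10:00:05 +0000] "GET /logo.png HTTP/1.1" 200 2048
10.0.0.1 - - [01/Jan/2024:10:00:20 +0000] "GET /products HTTP/1.1" 200 1024
10.0.0.2 - - [01/Jan/2024:10:01:00 +0000] "GET /home HTTP/1.1" 200 512
10.0.0.2 - - [01/Jan/2024:10:01:30 +0000] "GET /products HTTP/1.1" 200 1024
10.0.0.2 - - [01/Jan/2024:10:02:00 +0000] "GET /cart HTTP/1.1" 200 256
10.0.0.3 - - [01/Jan/2024:10:03:00 +0000] "GET /home HTTP/1.1" 404 0
"""

LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "GET (\S+) [^"]*" (\d{3}) \d+$')
STATIC_SUFFIXES = (".png", ".jpg", ".css", ".js")


def preprocess(log_text):
    """Phase 1: parse lines, drop failed requests and static assets,
    and group the remaining page views by visitor IP (a crude session)."""
    sessions = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed layout
        ip, path, status = match.groups()
        if status != "200" or path.endswith(STATIC_SUFFIXES):
            continue
        sessions[ip].append(path)
    return dict(sessions)


def discover_patterns(sessions):
    """Phase 2: count consecutive page-to-page transitions over all sessions."""
    transitions = Counter()
    for pages in sessions.values():
        transitions.update(zip(pages, pages[1:]))
    return transitions


def analyze_patterns(transitions, min_support=2):
    """Phase 3: keep only transitions observed at least min_support times."""
    return {pair: n for pair, n in transitions.items() if n >= min_support}


if __name__ == "__main__":
    sessions = preprocess(SAMPLE_LOG)
    frequent = analyze_patterns(discover_patterns(sessions))
    print(frequent)
```

On the sample log, only the `/home` to `/products` transition occurs in two sessions, so it is the only pattern that survives the support threshold. Grouping by IP address alone is a simplification; real pre-processing typically also has to identify users and split sessions by time.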

maithreyan surya

The World Wide Web is a popular and interactive medium to disseminate information today. It is a system of interlinked hypertext documents accessed via the Internet. With a web browser, one can view web pages that may contain text, images, videos, and other multimedia, and navigate between them via hyperlinks. With the recent explosive growth of the amount of content on the Internet, it has become increasingly difficult for users to find and utilize information and for content providers to classify and catalog documents on the World Wide Web. Traditional web search engines often return hundreds or thousands of results for a search, which is time consuming for users to browse. On-line libraries, search engines, and other large document repositories (e.g. customer support databases, product specification databases, press release archives, news story archives, etc.) are growing so rapidly that it is difficult and costly to categorize every document manually. To deal with these problems web mining is used. Web mining is the use of data mining techniques to automatically discover and extract information from the web documents and services. This paper presents an overview of web mining, its methodologies, algorithms and applications.

  • NEWS FEATURE
  • 17 July 2019
  • Correction 19 July 2019

The plan to mine the world’s research papers

  • Priyanka Pulla

Priyanka Pulla is a freelance journalist based in Bengaluru, India.


Carl Malamud in front of the data store of 73 million articles that he plans to let scientists text mine. Credit: Smita Sharma for Nature

Carl Malamud is on a crusade to liberate information locked up behind paywalls — and his campaigns have scored many victories. He has spent decades publishing copyrighted legal documents, from building codes to court records, and then arguing that such texts represent public-domain law that ought to be available to any citizen online. Sometimes, he has won those arguments in court. Now, the 60-year-old American technologist is turning his sights on a new objective: freeing paywalled scientific literature. And he thinks he has a legal way to do it.


Nature 571, 316-318 (2019)

doi: https://doi.org/10.1038/d41586-019-02142-1

Updates & Corrections

Correction 19 July 2019 : An earlier version of this feature used the term ‘fair use’ inappropriately — the term isn’t relevant under Indian law.


Welcome to TOP 10 research articles

Top 10 data mining papers: recommended reading - Data Mining & Knowledge Management research

Citation Count: 85

Data mining and its applications for knowledge management: a literature review from 2007 to 2012.

Tipawan Silwattananusarn 1 and Kulthida Tuamsuk 2

1 Ph.D. Student in Information Studies Program, Khon Kaen University, Thailand and 2 Head, Information & Communication Management Program, Khon Kaen University, Thailand

Data mining is one of the most important steps of the knowledge discovery in databases process and is considered a significant subfield of knowledge management. Research in data mining is expected to keep growing in business and in learning organizations over the coming decades. This review paper explores the applications of data mining techniques that have been developed to support the knowledge management process. Journal articles indexed in the ScienceDirect database from 2007 to 2012 are analyzed and classified. The discussion of the findings is divided into four topics: (i) knowledge resource; (ii) knowledge types and/or knowledge datasets; (iii) data mining tasks; and (iv) data mining techniques and applications used in knowledge management. The article first briefly describes the definition of data mining and data mining functionality. Then the knowledge management rationale and the major knowledge management tools integrated in the knowledge management cycle are described. Finally, the applications of data mining techniques in the process of knowledge management are summarized and discussed.

Data mining; Data mining applications; Knowledge management

[1] An, X. & Wang, W. (2010). Knowledge management technologies and applications: A literature review . IEEE, 138-141. doi:10.1109/ICAMS.2010.5553046

[2] Berson, A., Smith, S.J. &Thearling, K. (1999). Building Data Mining Applications for CRM. New York: McGraw-Hill .

[3] Cant�, F.J. & Ceballos, H.G. (2010). A multiagent knowledge and information network approach for managing research assets . Expert Systems with Applications, 37(7), 5272-5284.doi:10.1016/j.eswa.2010.01.012

[4] Cheng, H., Lu, Y. & Sheu, C. (2009). An ontology-based business intelligence application in a financial knowledge management system .Expert Systems with Applications, 36, 3614�3622. Doi:10.1016/j.eswa.2008.02.047

[5] Dalkir, K. (2005). Knowledge Management in Theory and Practice . Boston: Butterworth-Heinemann.

[6] Dawei, J. (2011). The Application of Date Mining in Knowledge Management .2011 International Conference on Management of e-Commerce and e-Government, IEEE Computer Society, 7-9. doi:10.1109/ICMeCG.2011.58

[7] Fayyad, U., Piatetsky-Shapiro, G. & Smyth, P. (1996). From Data Mining to Knowledge Discovery in Databases.AI Magazine, 17(3), 37-54.

[8] Gorunescu, F. (2011). Data Mining: Concepts, Models, and Techniques . India: Springer.

[9] Han, J. &Kamber, M. (2012). Data Mining: Concepts and Techniques . 3rd.ed. Boston: Morgan Kaufmann Publishers.

[10] Hwang, H.G., Chang, I.C., Chen, F.J. & Wu, S.Y. (2008). Investigation of the application of KMS for diseases classifications: A study in a Taiwanese hospital . Expert Systems with Applications, 34(1), 725-733. doi:10.1016/j.eswa.2006.10.018

[11] Lavrac, N., Bohanec, M., Pur, A., Cestnik, B., Debeljak, M. &Kobler, A. (2007).Data mining and visualization for decision support and modeling of public health-care resources.Journal of Biomedical Informatics, 40, 438-447. doi:10.1016/j.jbi.2006.10.003

[12] Li, X., Zhu, Z. & Pan, X. (2010). Knowledge cultivating for intelligent decision making in small & middle businesses .Procedia Computer Science, 1(1), 2479-2488. doi:10.1016/j.procs.2010.04.280

[13] Li, Y., Kramer, M.R., Beulens, A.J.M., Van Der Vorst, J.G.A.J. (2010). A framework for early warning and proactive control systems in food supply chain networks. Computers in Industry, 61, 852�862. Doi:101.016/j.compind.2010.07.010

[14] Liao, S.H., Chen, C.M., Wu, C.H. (2008). Mining customer knowledge for product line and brand extension in retailing. Expert Systems with Applications, 34(3), 1763-1776. doi:10.1016/j.eswa.2007.01.036

[15] Liao, S. (2003). Knowledge management technologies and applications-literature review from 1995 to 2002 . Expert Systems with Applications, 25, 155-164. doi:10.1016/S0957-4174(03)00043-5

[16] Liu, D.R. & Lai, C.H. (2011). Mining group-based knowledge flows for sharing task knowledge. Decision Support Systems ,50(2), 370-386. doi:10.1016/j.dss.2010.09.004

[17] Lee, M.R. & Chen, T.T. (2011). Revealing research themes and trends in knowledge management: From 1995 to 2010. Knowledge-Based Systems.doi:10.1016/j.knosys.2011.11.016

[18] McInerney, C.R. & Koenig, M.E. (2011). Knowledge Management (KM) Processes in Organizations: Theoretical Foundations and Practice . USA: Morgan & Claypool Publishers. doi:10.2200/S00323ED1V01Y201012ICR018

[19] McInerney, C. (2002). Knowledge Management and the Dynamic Nature of Knowledge .Journal of the American Society for Information Science and Technology, 53(12), 1009-1018. doi:10.1002/asi.10109

[20] Ngai, E., Xiu, L. &Chau, D. (2009). Application of data mining techniques in customer relationship management: A literature review and classification . Expert Systems with Applications, 36, 2592- 2602. doi:10.1016/j.eswa.2008.02.021

[21] Ruggles, R.L. (ed.). (1997). Knowledge Management Tools. Boston: Butterworth-Heinemann.

[22] Sher, P.J. & Lee, V.C. (2004). Information technology as a facilitator for enhancing dynamic capabilities through knowledge management.Information & Management, 41, 933-945. doi:10.1016/j.im.2003.06.004

[23] Tseng, S.M. (2008). The effects of information technology on knowledge management systems .Expert Systems with Applications, 35, 150-160. doi:10.1016/j.eswa.2007.06.011

[24] Ur-Rahman, N. & Harding, J.A. (2012). Textual data mining for industrial knowledge management and text classification: A business oriented approach . Expert Systems with Applications, 39, 4729-4739. doi:10.1016/j.eswa.2011.09.124

[25] Wang, F. & Fan, H. (2008). Investigation on Technology Systems for Knowledge Management.IEEE, 1-4. doi:10.1109/WiCom.2008.2716

[26] Wang, H. & Wang, S. (2008). A knowledge management approach to data mining process for business intelligence. Industrial Management & Data Systems, 108(5), 622-634.

[27] Wu, W., Lee, Y.T., Tseng, M.L. & Chiang, Y.H. (2010). Data mining for exploring hidden patterns between KM and its performance.Knowledge-Based Systems, 23, 397-401. doi:10.1016/j.knosys.2010.01.014

Citation Count: 83

Analysis of heart diseases dataset using neural network approach.

K. Usha Rani

Dept. of Computer Science, Sri Padmavathi Mahila Visvavidyalayam (Women's University), Tirupati – 517502, Andhra Pradesh, India

Classification is one of the important techniques of data mining. Many real-world problems in fields such as business, science, industry and medicine can be solved using a classification approach. Neural Networks have emerged as an important tool for classification, and their advantages help in classifying the given data efficiently. In this study a heart disease dataset is analyzed using a Neural Network approach. To increase the efficiency of the classification process, a parallel approach is also adopted in the training phase.

Data mining, Classification, Neural Networks, Parallelism, Heart Disease

International Journal of Data Mining & Knowledge Management Process (IJDKP), Vol. 1, No. 5, September 2011

Citation Count: 80

Predicting students' performance using ID3 and C4.5 classification algorithms.

Kalpesh Adhatrao, Aditya Gaykar, Amiraj Dhawan, Rohit Jha and Vipul Honrao

Department of Computer Engineering, Fr. C.R.I.T., Navi Mumbai, Maharashtra, India

An educational institution needs to have an approximate prior knowledge of enrolled students to predict their performance in future academics. This helps them to identify promising students and also provides them an opportunity to pay attention to and improve those who would probably get lower grades. As a solution, we have developed a system which can predict the performance of students from their previous performances using concepts of data mining techniques under Classification. We have analyzed the data set containing information about students, such as gender, marks scored in the board examinations of classes X and XII, marks and rank in entrance examinations and results in first year of the previous batch of students. By applying the ID3 (Iterative Dichotomiser 3) and C4.5 classification algorithms on this data, we have predicted the general and individual performance of freshly admitted students in future examinations.

Classification, C4.5, Data Mining, Educational Research, ID3, Predicting Performance
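
As a rough illustration of the attribute-selection step that ID3 relies on, the sketch below computes entropy and information gain and picks the attribute with the highest gain. The records, the attribute names (`gender`, `board_marks`) and the `result` labels are hypothetical and are not taken from the paper's data set. C4.5 differs mainly in normalising this gain by the split information (gain ratio), which is not shown here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_attribute(rows, attributes, target):
    """Attribute ID3 would split on first: the one with the highest gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical student records, for illustration only.
STUDENTS = [
    {"gender": "F", "board_marks": "high", "result": "pass"},
    {"gender": "M", "board_marks": "high", "result": "pass"},
    {"gender": "F", "board_marks": "low", "result": "fail"},
    {"gender": "M", "board_marks": "low", "result": "fail"},
]

if __name__ == "__main__":
    print(best_attribute(STUDENTS, ["gender", "board_marks"], "result"))
```

In this toy table `board_marks` separates the classes perfectly while `gender` carries no information, so a tree built this way would split on `board_marks` first.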

[1] Han, J. and Kamber, M., (2006) Data Mining: Concepts and Techniques , Elsevier.

[2] Dunham, M.H., (2003) Data Mining: Introductory and Advanced Topics, Pearson Education Inc.

[3] Kantardzic, M., (2011) Data Mining: Concepts, Models, Methods and Algorithms, Wiley-IEEE Press.

[4] Ming, H., Wenying, N. and Xu, L., (2009) �An improved decision tree classification algorithm based on ID3 and the application in score analysis�, Chinese Control and Decision Conference (CCDC), pp1876-1879.

[5] Xiaoliang, Z., Jian, W., Hongcan Y., and Shangzhuo, W., (2009) � Research and Application of the improved Algorithm C4.5 on Decision Tree �, International Conference on Test and Measurement (ICTM), Vol. 2, pp184-187.

[6] CodeIgnitor User Guide Version 2.14, http://ellislab.com/codeigniter/user-guide/toc.html

[7] RapidMiner, http://rapid-i.com/content/view/181/190/

[8] MySQL � The world�s most popular open source database, http://www.mysql.com/

Citation Count: 51

Diagnosis of diabetes using classification mining techniques.

Aiswarya Iyer, S. Jeyalatha and Ronak Sumbaly

Department of Computer Science, BITS Pilani Dubai, United Arab Emirates

Diabetes has affected over 246 million people worldwide, with a majority of them being women. According to the WHO report, by 2025 this number is expected to rise to over 380 million. The disease has been named the fifth deadliest disease in the United States with no imminent cure in sight. With the rise of information technology and its continued advance into the medical and healthcare sector, cases of diabetes and their symptoms are well documented. This paper aims at finding solutions to diagnose the disease by analyzing the patterns found in the data through classification analysis, employing Decision Tree and Naïve Bayes algorithms. The research hopes to propose a quicker and more efficient technique of diagnosing the disease, leading to timely treatment of the patients.

Classification, Data Mining, Decision Tree, Diabetes and Naïve Bayes.

[1] National Diabetes Information Clearinghouse (NDIC), http://diabetes.niddk.nih.gov/dm/pubs/type1and2/#signs

[2] Global Diabetes Community, http://www.diabetes.co.uk/diabetes_care/blood-sugar-level-ranges.html

[3] Jiawei Han and Micheline Kamber, “Data Mining Concepts and Techniques�, Morgan Kauffman Publishers, 2001

[4] S. Kumari and A. Singh, � A Data Mining Approach for the Diagnosis of Diabetes Mellitus �, Proceedings of Seventh lnternational Conference on Intelligent Systems and Control, 2013, pp. 373-375

[5] C. M. Velu and K. R. Kashwan, �Visual Data Mining Techniques for Classification of Diabetic Patients�, 3rd IEEE International Advance Computing Conference (IACC), 2013

[6] Sankaranarayanan.S and Dr Pramananda Perumal.T, � Predictive Approach for Diabetes Mellitus Disease through Data Mining Technologies �, World Congress on Computing and Communication Technologies, 2014, pp. 231-233

[7] Mostafa Fathi Ganji and Mohammad Saniee Abadeh, �Using fuzzy Ant Colony Optimization for Diagnosis of Diabetes Disease�, Proceedings of ICEE 2010, May 11-13, 2010

[8] T.Jayalakshmi and Dr.A.Santhakumaran, � A Novel Classification Method for Diagnosis of Diabetes Mellitus Using Artificial Neural Networks �, International Conference on Data Storage and Data Engineering, 2010, pp. 159-163

[9] Sonu Kumari and Archana Singh, �A Data Mining Approach for the Diagnosis of Diabetes Mellitus�, Proceedings of71hlnternational Conference on Intelligent Systems and Control (ISCO 2013)

[10] Neeraj Bhargava, Girja Sharma, Ritu Bhargava and Manish Mathuria, Decision Tree Analysis on J48 Algorithm for Data Mining. Proceedings of International Journal of Advanced Research in Computer Science and Software Engineering, Volume 3, Issue 6, June 2013.

[11] Michael Feld, Dr. Michael Kipp, Dr. Alassane Ndiaye and Dr. Dominik Heckmann �Weka: Practical machine learning tools and techniques with Java implementations�

[12] White, A.P., Liu, W.Z.: Technical note: Bias in information-based measures in decision tree induction . Machine Learning 15(3), 321�329 (1994)

Citation Count: 42

A new clustering approach for anomaly intrusion detection.

Ravi Ranjan and G. Sahoo

Department of Information Technology, Birla Institute of Technology, Mesra, Ranchi

Recent advances in technology have made our work easier compared to earlier times. Computer networks are growing day by day, but the security of computers and networks has always been a major concern for organizations, from small to large enterprises. Organizations are aware of possible threats and attacks and prepare for them, but loopholes still allow attackers to succeed. Intrusion detection is a major field of research, and researchers are trying to find new algorithms for detecting intrusions. Clustering techniques from data mining are an interesting area of research for detecting possible intrusions and attacks. This paper presents a new clustering approach for anomaly intrusion detection based on the K-medoids clustering method with certain modifications. The proposed algorithm is able to achieve a high detection rate and overcomes the disadvantages of the K-means algorithm.

Clustering, data mining, intrusion detection, network security
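
For readers unfamiliar with K-medoids, here is a minimal sketch of the plain alternating version: assign every point to its nearest medoid, then move each medoid to the cluster member with the smallest total distance to the rest. It is not the modified algorithm proposed in the paper. The Manhattan distance, the function names and the toy points are assumptions made for illustration, and the points are assumed to be distinct tuples.

```python
import random


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_medoids(points, k, max_iter=100, seed=0):
    """Plain alternating K-medoids; returns {medoid: [member points]}."""
    rng = random.Random(seed)
    medoids = rng.sample(points, k)
    clusters = {}
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for p in points:
            nearest = min(medoids, key=lambda m: manhattan(p, m))
            clusters[nearest].append(p)
        # Update step: the new medoid is the member with the lowest
        # total distance to the other members of its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(manhattan(c, q) for q in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break  # medoids stopped moving
        medoids = new_medoids
    return clusters


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors, e.g. two groups of connection records.
    pts = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
    for medoid, members in k_medoids(pts, k=2).items():
        print(medoid, members)
```

Because every medoid is an actual data point, a single extreme record cannot drag a cluster centre the way it can shift a K-means mean, which is the usual argument for K-medoids on noisy traffic data.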

International Journal of Data Mining & Knowledge Management Process (IJDKP), Vol. 4, No. 2, March 2014

Citation Count: 34

Incremental learning: areas and methods - a survey.

Prachi Joshi 1 and Parag Kulkarni 2

1 Assistant Professor, MIT College of Engineering, Pune and 2 Adjunct Professor, College of Engineering, Pune

While the areas of application of data mining are growing substantially, it has become necessary for incremental learning methods to move a step ahead. The tremendous growth of unlabeled data has pushed incremental learning to take a big leap. From BI applications to image classification, from analysis to prediction, every domain needs to learn and update. Incremental learning allows new areas to be explored while knowledge continues to be accumulated. In this paper we discuss the areas and methods of incremental learning currently in use and highlight its potential for decision making. The paper essentially gives an overview of current research that will provide background on the topic for students and research scholars.

Incremental, learning, mining, supervised, unsupervised, decision-making
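
The core idea of learning and updating without revisiting old data can be sketched with a nearest-centroid classifier whose class means are updated one example at a time. This is only an illustrative toy, not a method taken from the survey; the class name `IncrementalCentroids` and its methods are hypothetical.

```python
class IncrementalCentroids:
    """Nearest-centroid classifier updated one labelled example at a time."""

    def __init__(self):
        self.counts = {}     # label -> number of examples seen so far
        self.centroids = {}  # label -> running mean of the feature vectors

    def learn_one(self, x, label):
        """Fold one example into the running mean; old data is not stored."""
        n = self.counts.get(label, 0) + 1
        self.counts[label] = n
        old = self.centroids.get(label, [0.0] * len(x))
        # Running-mean update: mean_n = mean_{n-1} + (x - mean_{n-1}) / n
        self.centroids[label] = [m + (xi - m) / n for m, xi in zip(old, x)]

    def predict(self, x):
        """Return the label whose centroid is closest (squared Euclidean)."""
        def sq_dist(centroid):
            return sum((xi - ci) ** 2 for xi, ci in zip(x, centroid))
        return min(self.centroids, key=lambda lbl: sq_dist(self.centroids[lbl]))


if __name__ == "__main__":
    model = IncrementalCentroids()
    for features, label in [((0, 0), "a"), ((2, 2), "a"), ((10, 10), "b")]:
        model.learn_one(features, label)
    print(model.predict((1, 2)))
```

Each update costs time proportional to the number of features and needs no access to earlier examples, which is the property that makes this style of learning attractive for streams.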

[1] Y. Lui, J. Cai, J. Yin, A. Fu, Clustering text data streams , Journal of Computer Science and Technology, 2008, pp 112-128.

[2] A. Fahim, G. Saake, A. Salem, F. Torky, M. Ramadan, K-means for spherical clusters with large variance in sizes , Journal of World Academy of Science, Engineering and Technology, 2008.

[3] F. Camastra, A. Verri, A novel kernel method for clustering, IEEE Transactions on Pattern Analysis and Machince Intelligence, Vol. 27, no.5, 2005, pp 801-805.

[4] F. Shen, H. Yu, Y. Kamiya, O. Hasegawa, An Online Incremental Semi-Supervised Learning Method , Journal of advanced Computational Intelligence and Intelligent Informatics, Vol. 14, No.6, 2010.

[5] T. Zhang, R. Ramakrishnan, M. Livny, Birch: An efficient data clustering method for very large databases, Proc. ACM SIGMOD Intl.Conference on Management of Data , 1996, pp.103-114.

[6] S. Deelers, S. Auwantanamongkol, Enhancing k-means algorithm with initial cluster centers derived from data partitioning along the data axis with highest variance , International Journal of Electrical and Computer Science, 2007, pp 247-252.

[7] S. Young, A. Arel, T. Karnowski, D. Rose, A Fast and Stable Incremental Clustering Algorithm , Proc. of International Conference on Information Technology New Generations, 2010, pp 204-209.

[8] M. Charikar, C. Chekuri, T. Feder, R. Motwani, Incremental clustering and dynamic information retrival, Proc. of ACM symposium on Theory of Computeion , 1997, pp 626- 635.

[9] K. Hammouda, Incremental document clustering using Cluster similarity histograms , Proc. of IEEE International Conference on Web Intelligence, 2003, pp 597- 601.

[10] X. Su, Y. Lan,R. Wan, Y. Qin, A fast incremental clustering algorithm , Proc. of International Symposium on Information Processing, 2009, pp 175-178.

[11] T. Li, HIREL: An incremental clustering for relational data sets , Proc. of IEEE International Conference on Data Mining, 2008, pp 887 � 892.

[12] P. Lin, Z. Lin, B. Kuang, P. Huang, A Short Chinese Text Incremental Clustering Algorithm Based on Weighted Semantics and Naive Bayes , Journal of Computational Information Systems, 2012, pp 4257- 4268.

[13] C. Chen, S. Hwang, Y. Oyang, An Incremental hierarchical data clustering method based on gravity theory , Proc. of PAKDD, 2002, pp 237-250.

[14] M. Ester, H. Kriegel, J. Sander, M. Wimmer, X. Xu, Incremental Clustering for Mining in a Data Warehousing Environment , Proc. of Intl. Conference on very large data bases, 1998, pp 323-333.

[15] G. Shaw, Y. Xu, Enhancing an incremental clustering algorithm for web page collections, Proc. of IEEE/ACM/WIC Joint Conference on Web Intelligence and and Intelligent Agent Technology, 2009.

[16] C. Hsu, Y. Huang, Incremental clustering of mixed data based on distance hierarchy , Journal of Expert systems and Applications, 35, 2008, pp 1177 � 1185.

[17] S. Asharaf, M. Murty, S. Shevade, Rough set based incremental clustering of interval data, Pattern Recognition Letters, Vol.27 (9), 2006, pp 515-519.

[18] Z. Li, Incremental Clustering of trajectories , Computer and Information Science, Springer 2010, pp 32-46.

[19] S. Elnekava, M. Last, O. Maimon, I ncremental clustering of mobile objects , Proc. of IEEE International Conference on Data Engineering, 2007, pp 585-592.

[20] S. Furao, A. Sudo, O. Hasegawa, An online incremental learning pattern -based reasoning system, Journal of Neural Networks, Elsevier, Vol. 23,(1), 2010.pp 135-143.

[21] S. Ferilli, M. Biba, T.Basile, F. Esposito, Incremental Machine learning techniques for document layout understanding , Proc. of IEEE Conference on Pattern Recognition, 2008, pp 1-4.

[22] S. Ozawa, S. Pang, N. Kasabov, Incremental Learning of chunk data for online pattern classification systems, IEEE Transactions on Neural Networks, Vo. 19 (6), 2008, pp 1061-1074.

[23] Z. Chen, L. Huang, Y. Murphey, Incremental learning for text document classification , Proc. of IEEE Conference on Neural Networks, 2007, pp 2592-2597. 51

[24] R. Polikar, L. Upda, S. Upda, V. Honavar, Learn ++: An incremental learning algorithm for supervised neural networks , IEEE Transactions on Systems, Man and Cybernatics, Vol.31 (4), 2001, pp 497-508.

[25] H. He, S. Chen, K. Li, X. Xu, Incremental learning from stream data, IEEE Transactions on Neural Networks , Vol.22(12), 2011, pp 1901-1914.

[26] A. Bouchachia, M. Prosseger, H. Duman, Semi supervised incremental learning, Proc. of IEEE International Conference on Fuzzy Systems, 2010 pp 1-7.

[27] R. Zhang, A. Rudnicky, A new data section principle for semi-supervised incremental learning , Computer Science department, paper 1374, 2006, http://repository.cmu.edu/compsci/1373 .

[28] Z. Li, S. Watchsmuch, J. Fritsch, G. Sagerer, Semi-supervised incremental learning of manipulative tasks, Proc. of International Conference on Machine Vision Applications, 2007, pp 73-77.

[29] A. Misra, A. Sowmya, P. Compton, Incremental learning for segmentation in medical images , Proc. of IEEE Conference on Biomedical Imaging, 2006.

[30] P. Kranen, E. Muller, I. Assent, R. Krieder, T. Seidl, Incremental Learning of Medical Data for MultiStep Patient Health Classification, Database technology for life sciences and medicine, 2010.

[31] J. Wu, B. Zhang, X. Hua, J, Zhang, A semi-supervised incremental learning framework for sports video view classification, Proc. of IEEE Conference on Multi-Media Modelling, 2006.

[32] S. Wenzel, W. Forstner, Semi supervised incremental learning of hierarchical appearance models , The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol.37,2008.

[33] S. Ozawa, S. Toh, S. Abe, S. Pang, N. Kasabov, Incremental Learning for online face recognition , Proc. of IEEE Conference on Neural Networks, Vol. 5, 2005 pp 3174-3179.

[34] Z. Erdem, R. Polikar, F. Gurgen, N. Yumusak, Ensemble of SVMs for Incremental Learning , Multiple Classifier Systems, Springer Verlang,, 2005, pp 246-256.

[35] X. Yang, B. Yuan, W. Liu, Dynamic Weighting ensembles for incremental learning , Proc. of IEEE conference in pattern recognition. 2009, pp 1-5.

[36] R. Elwell, R. Polikar, Incremental Learning of Concept drift in nonstationary environments, IEEE Transactions on Neural Networks, Vol.22 (10), 2011 pp 1517- 1531.

[37] W. Khreich, E. Granger, A. Miri, R. Sabourin, A survey of techniques for incremental learning of HMM parameters , Journal of Information Science, Elsevier, 2012.

[38] O. Buffet, A. Duetch, F. Charpillet, Incremental Reinforcement Learning for designing multi-agent systems , Proc. of ACM International Conference on Autonomous Agents, 2001.

[39] E. Demidova, X. Zhou, W. Nejdl, A probabilistic scheme for keyword-based incremental query construction, IEEE Transactions on Knowledge and Data Engineering, 2012, pp 426-439.

[40] R. Roscher, W. Forestner, B. Waske, I2VM: Incremental import vector machines , Journal of Image and Vision Computing, Elsevier, 2012.

Citation Count: 33

A prototype decision support system for optimizing the effectiveness of e-learning in educational institutions.

S. Abu-Naser, A. Al-Masri, Y. Abu Sultan and I. Zaqout

Al Azhar University, Gaza, Palestine

In this paper, a prototype of a Decision Support System (DSS) is proposed for providing the knowledge needed to optimize a newly adopted e-learning strategy in educational institutions. If an educational institution adopts e-learning as a new strategy, it should undertake a preliminary evaluation to determine the percentage of success and the areas of weakness of this strategy. Done manually, this evaluation would not be an easy task and would not provide knowledge about all pitfall symptoms. The proposed DSS is based on exploration (mining) of knowledge from the large amounts of data generated by the institution's operations. This knowledge can be used to guide and optimize any new business strategy implemented by the institution. The proposed DSS involves a Database engine, a Data Mining engine and an Artificial Intelligence engine. All these engines work together in order to extract the knowledge necessary to improve the effectiveness of any strategy, including e-learning.

DSS, E-learning, knowledge, Database, Data mining, Artificial Intelligence.

Citation Count: 27

Experimental study of data clustering using k-means and modified algorithms.

M.P.S Bhatia and Deepika Khurana

University of Delhi, New Delhi, India

The k-means clustering algorithm is a long-established algorithm that has been intensively researched owing to its simplicity and ease of implementation. Clustering algorithms are broadly attractive and useful in exploratory data analysis. This paper presents the results of an experimental study of different approaches to k-means clustering, comparing results on different datasets using the original k-means and other modified algorithms implemented in MATLAB R2009b. The results are evaluated on performance measures such as number of iterations, number of misclassified points, accuracy, Silhouette validity index and execution time.

Data Mining, Clustering Algorithm, k-means, Silhouette Validity Index.
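
As a rough illustration of two of the measures named in the abstract (number of iterations and the Silhouette validity index), the following is a minimal Python sketch of the original k-means procedure. It is not the authors' MATLAB implementation and does not cover the modified algorithms they compare. The function names `kmeans` and `silhouette`, the random initialisation, and the toy points are assumptions made for illustration only.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain (Lloyd-style) k-means. Returns (labels, centroids, iterations)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: k sampled points as initial centres
    labels = [-1] * len(points)
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            return labels, centroids, iteration
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids, max_iter


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (higher is better)."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster or a single cluster: score 0
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated groups of 2-D points.
toy_points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
toy_labels, toy_centroids, toy_iterations = kmeans(toy_points, k=2)
print("iterations:", toy_iterations)
print("silhouette:", round(silhouette(toy_points, toy_labels), 3))
```

Execution time could be measured by wrapping the `kmeans` call with `time.perf_counter()`. Accuracy and misclassified points require ground-truth labels, which this sketch does not assume.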

[1] Ran Vijay Singh and M.P.S Bhatia , � Data Clustering with Modified K-means Algorithm �, IEEE International Conference on Recent Trends in Information Technology, ICRTIT 2011, pp 717-721.

[2] D. Napoleon and P. Ganga lakshmi, � An Efficient K-Means Clustering Algorithm for Reducing Time Complexity using Uniform Distribution Data Points �, IEEE 2010.

[3] Tajunisha and Saravanan, � Performance Analysis of k-means with different initialization methods for high dimensional data � International Journal of Artificial Intelligence & Applications (IJAIA), Vol.1, No.4, October 2010

[4] Neha Aggarwal and Kriti Aggarwal,� A Mid- point based k �mean Clustering Algorithm for Data Mining �. International Journal on Computer Science and Engineering (IJCSE) 2012.

[5] Barile� Barisi Baridam,� More work on k-means Clustering algortithm: The Dimensionality Problem �. International Journal of Computer Applications (0975 � 8887)Volume 44� No.2, April 2012.

[6] Shi Na, Li Xumin, Guan Yong �Research on K-means clustering algorithm�. Proc of Third International symposium on Intelligent Information Technology and Security Informatics, IEEE 2010.

[7] Ahamad Shafeeq and Hareesha � Dynamic clustering of data with modified K-mean algorithm �, Proc. International Conference on Information and Computer Networks (ICICN 2012) IPCSIT vol. 27 (2012) � (2012) IACSIT Press, Singapore 2012.

[8] Kohei Arai,Ali Ridho Barakbah, Hierarchical K-means: an algorithm for centroids initialization for K-means.

[9] Data Mining Concepts and Techniques,Second edition Jiawei Han and Micheline Kamber.

[10] �Towards more accurate clustering method by using dynamic time warping� International Journal of Data Mining and Knowledge Management Process (IJDKP), Vol.3, No.2,March 2013.

[11] C. S. Li, � Cluster Center Initialization Method for K-means Algorithm Over Data Sets with Two Clusters �, �2011 International Conference on Advances in Engineering, Elsevier�, pp. 324-328, vol.24, 2011.

[12] A Review of Data Clustering Approaches Vaishali Aggarwal, Anil Kumar Ahlawat, B.N Panday. ISSN: 2277-3754 International Journal of Engineering and Innovative Technology (IJEIT) Volume 1, Issue 4, April 2012.

[13] Ali Alijamaat, Madjid Khalilian, and Norwati Mustapha, � A Novel Approach for High Dimensional Data Clustering � 2010 Third International Conference on Knowledge Discovery and Data Mining.

[14] Zhong Wei, et al. “ Improved K-Means Clustering Algorithm for Exploring Local Protein Sequence Motifs Representing Common Structural Property ” IEEE Transactions on Nanobioscience, Vol.4., No.3. Sep. 2005. 255-265.

[15] K.A.Abdul Nazeer, M.P.Sebastian, �I mproving the Accuracy and Efficiency of the k-means Clustering Algorithm �,Proceeding of the World Congress on Engineering, vol 1,london, July 2009.

[16] Mu-Chun Su and Chien-Hsing Chou � A Modified version of k-means Algorithm with a Distance Based on Cluster Symmetry �.IEEE Transactions On Pattern Analysis and Machine Intelligence, Vol 23 No. 6 ,June 2001.

Citation Count: 26

Data, text and web mining for business intelligence : a survey.

Abdul-Aziz Rashid Al-Azmi

Department of Computer Engineering, Kuwait University, Kuwait

The Information and Communication Technologies revolution has created a digital world in which huge amounts of data are available. Enterprises use mining technologies to search vast amounts of data for vital insight and knowledge. Mining tools such as data mining, text mining, and web mining are used to find hidden knowledge in large databases or on the Internet. Mining tools are automated software tools used to achieve business intelligence by finding hidden relations and predicting future events from vast amounts of data. The uncovered knowledge helps in gaining competitive advantages, building better customer relationships, and even detecting fraud. In this survey, we describe how these techniques work and how they are implemented. Furthermore, we discuss how business intelligence is achieved using these mining tools. We then look into some case studies of success stories using mining tools. Finally, we present some of the main challenges that limit the potential of mining technologies.

Citation Count: 21

Applications of data mining techniques in life insurance.

A. B. Devale 1 and R. V. Kulkarni 2

1 Arts, Commerce, Science College, Palus Dist. Sangli, Maharashtra and 2 Shahu Institute of Business Research, Kolhapur, Maharashtra

Knowledge discovery systems in financial organizations have been built and operated mainly to support decision making, using knowledge as a strategic factor. In this paper, we investigate the use of various data mining techniques for knowledge discovery in the insurance business. Existing software is inefficient in showing such data characteristics. We introduce different exhibits for discovering knowledge in the form of association rules, clustering, classification and correlation suited to the data characteristics. Using the proposed data mining techniques, the decision-maker can define the expansion of insurance activities to empower the different forces in the existing life insurance sector.

Insurance, Association rules, Clustering, Classification, Correlation, Data mining.
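
To make the association-rule part of the abstract concrete, here is a small Python sketch that computes support and confidence for single-item rules of the form A -> B. It is an illustration under stated assumptions, not the authors' method: the function names, the thresholds, and the policy-feature transactions are all hypothetical.

```python
from itertools import permutations


def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def single_item_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for rules A -> B."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in permutations(items, 2):
        s_ab = support(transactions, {a, b})
        s_a = support(transactions, {a})
        if s_a == 0 or s_ab < min_support:
            continue
        confidence = s_ab / s_a  # estimate of P(B | A)
        if confidence >= min_confidence:
            found.append((a, b, s_ab, confidence))
    return found


# Hypothetical transactions: products held by four made-up policyholders.
policies = [
    frozenset({"term_life", "accident_rider"}),
    frozenset({"term_life", "accident_rider", "critical_illness"}),
    frozenset({"term_life"}),
    frozenset({"endowment", "critical_illness"}),
]

for rule in single_item_rules(policies, min_support=0.5, min_confidence=0.8):
    print(rule)
```

A full miner such as Apriori would also enumerate larger itemsets and prune by support. This sketch stops at item pairs to keep the two measures visible.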

[1] Alex Berson and Stephen J. Smith, � Data Warehousing, Data Mining, And OLAP �,MC Graow�Hill, 1997.

[2] Bigus and Joseph P, �Data Mining With Neural Networks�, MC Graw�Hill, New York 1996.

[3] Christopher J. Matheus, Gregory Piatetshy�Shapiro and Dwight Mcneill�, Selecting and Reporting what is Interesting The Kefir Application to Health Care Data�, Advances in Knowledge Discovery and Data Mining, AAA1 Press/The MIT Press, 1996.

[4] Dasrathy B. V., Ed, � Nearest Neighbor Norms: NN Pattern Classification Techniques �,IEEE, Computer Society Press, Calif. 1990.

[5] David Cheung, Vincent T., Ada W. Fu and Yongjian Fv, � Efficient Mining of Association Rules in Distributed Databases �, IEEE, 1996.

[6] Graig Silverstein, Sergey Brin and Rajeev Montwani, �Beyond Market Baskets: Generalizing Association Rules to Dependence Rules�, Data Mining and Knowledge Discovery, Vol. 2, No. 1, Jan 1998, Kluwer Academic Publishers.

[7] Hongjun LU, Ling Feng and Jiawei Han, �Beyond Intratransaction Association Analysis: Mining Multidimensional Intertransaction Association Rules�, ACM Transactions on Information Systems, Vol. 18, October 2000.

[8] Huan Liu, Farhad Hussain, Chew Lim Tan and Manoranjan Dash, � Discretization: An Enabling Technique “, Data Mining and Knowledge Discovery�, vol. 6 No. 4, October 2002.

[9] J. Date, “ An Introduction to Database Systems “, Addition Wesley Longman, Seven Edition, 2000.

[10] Jiawei Han, Laks V. S. Lakshmanan and Raymond T.NG, �Constraint-Based Multidimensional Data Mining�, IEEE, August 1999.

[11] Jorg-Uwe Kietz, Regina Zucker and Anca Vaduva, �Mining Mart: Combining Case- Based Reasoning and multi-Strategy Learning Into a Frame For Reusing KDD-Applications�, Proc 5th Workshop on Multi-Strategy Learning (MSL 2000) Portugal, June 2000, Kluwer Academic Publishers.

[12] Ken Orr, �Data Warehouse Technology�, Copyright. The Ken Or Institute, 1997.

[13] Krzysztof J. Cios, Witold Pedryez and Roman W. Surniarski, � Data Mining Methods for Knowledge Discovery�, Kluwer Academic Publishers 1998 Second Printing 2000.

[14] Mariano Fernendez Lopez, Asuncion Gomez-Perez, Juan Pazos Sierra, Polytechnic and Alejandro Pazos Sierra, � Building a Chemical Ontology Using Methontology and the Ontology Design Environment �, IEEE Intelligent System. Jan / Feb 1999.

[15] Martin Staudt, Anca Vaduva and Thomas c, � Metadata Management and Data Warehouse “, Technical Report, Information System Research, Swiss Life, University of Zurich, Department of Computer Science, July 1999. [email protected]

[16] Ming-Syan chen, Jiawei Han and Philip S. Yu, � Data Mining: An Overview From a Database Perspective �, IEEE Transactions on Knowledge and Data Engineering Vol. 8, No. 6, Dec. 1996.

[17] Natalya Friedman Noy and Carole D. Hafner, � The State of The Art in Ontology Design �, AI Magazine Vol. 18, No. 3, Fall 1997.

[18] Rakesh A. grawal, � Parallel Mining of Associations Rule �, IEEE, Dec 1996.

[19] Ramakrishnan Srikant and Rakesh A. Grawal, � Mining Quantitative Association Rules in Large Relational Tables �, Proc Sigmod �96, 6/96 Montreal Canada, 1996 ACM.

[20] Ramakrishnan Srikant and Rakesh A. Grawal, � Mining Generalized Association Rules “, Proceedings of The �21st VLDB Conference�, Zurich, Switzerland, 1995.

[21] Raymond T. Ng, Laks V. S. Lakshmanan, Jiawei Hon and Alex Pany, � Exploratory Mining and Pruning Optimizations of Constrained Associations Rules �, ACM 1998 page 13.

[22] Mr. A. B. Devale and Dr. R. V. Kulkarni � A REVIEW OF DATA MINING TECHNIQUES IN INSURANCE SECTOR � Golden Research Thoughts Vol – I , ISSUE – VII [ January 2012 ]

Data mining for dummies

Uploaded by station65.cebu on November 4, 2022

DATA MINING IEEE PAPERS AND PROJECTS-2020

Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.

Data stream mining, as its name suggests, is connected with two basic fields of computer science, i.e., data mining and data streams. Data mining [1 4] is an interdisciplinary subfield of computer science whose main aim is to develop tools and methods for exploring

Search Results for "data mining research papers"

Showing 125 open source projects for "data mining research papers"

Awesome Fraud Detection Research Papers

A curated list of data mining papers about fraud detection

Apache Sedona

Cluster computing framework for processing large-scale geospatial data

Karate Club

An API Oriented Open-source Python Framework for Unsupervised Learning

PyVista

3D plotting and mesh analysis through a streamlined interface

Qlib

Qlib is an AI-oriented quantitative investment platform

ThuThesis

LaTeX Thesis Template for Tsinghua University

Fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python

Open pharmacovigilance data extraction, mining and analysis tool

UnBBayes

Framework & GUI for Bayes Nets and other probabilistic models.

The Lemur Project

Search engine and data mining applications and ClueWeb datasets.

Auditory Modeling Toolbox

Bitmagic library.

Compressed bit-sets, sparse bit matrices and algorithms

Habfuzz

A command-line tool for data-driven fuzzy modelling

stkpp

C++ Statistical ToolKit

Awesome Explainable Graph Reasoning

A collection of research papers and software related to explainability

Open Source and data mining software

Visor for mass spectrometry JSON files generated by Integrator

DynaQ

Innovative text document search. http://dynaq.opendfki.de for details.

wav2letter++

Facebook AI research's automatic speech recognition toolkit

Superalgos

Free and Open-Source Crypto Trading Bots Platform

Integrator

Integrates proteomic searches from multiple search engines

SequestGUI

GUI front-end to sequest.exe

EasierMGF

Converts RAW Thermo Files into MGF files

OmssaGUI

GUI front-end for OMSSA

IMAGES

  1. (PDF) Data mining techniques and methodologies

  2. (PDF) Research on Web Data Mining

  3. (PDF) A Study On Applications Of Data Mining

  4. 😍 Data mining research paper. What are some good research topics in

  5. (PDF) Educational Data Mining: a Case Study

  6. (PDF) A review on Data Mining & Big Data Analytics

COMMENTS

  1. 345193 PDFs

    Explore the latest full-text research PDFs, articles, conference papers, preprints and more on DATA MINING. Find methods information, sources, references or conduct a literature review on DATA MINING

  2. (PDF) Trends in data mining research: A two-decade review using topic

    Address: 20 Myasnitskaya Street, Moscow 101000, Russia. Abstract: This work analyzes the intellectual structure of data mining as a scientific discipline. To do this, we use topic analysis ...

  3. Home

    Data Mining and Knowledge Discovery is a leading technical journal focusing on the extraction of information from vast databases. Publishes original research papers and practice in data mining and knowledge discovery. Provides surveys and tutorials of important areas and techniques. Offers detailed descriptions of significant applications.

  4. Data mining techniques and applications

    Data mining is also known as Knowledge Discovery in Databases (KDD). It is defined as the process of extracting interesting, interpretable and useful information from raw data. Many different sources generate raw data in very large amounts, which is the main reason the applications of data mining are increasing rapidly. This paper reviews data mining techniques ...

  5. Data Mining Methods and Obstacles: A Comprehensive Analysis

    Data mining is a step in the knowledge discovery in databases (KDD) procedure, a data science method for gathering, processing, and conducting more in-depth analyses of data. While they ...

  6. Data mining

    Data mining is the process of extracting potentially useful information from data sets. It uses a suite of methods to organise, examine and combine large data sets, including machine learning ...

  7. PDF Data Mining in the Real World: Experiences, Challenges, and ...

    Abstract - Data mining is used regularly in a variety of industries and is continuing to gain in both popularity and acceptance. However, applying data mining methods to complex real-world tasks is far from straightforward and many pitfalls face data mining practitioners. However, most research in the field tends to focus on the algorithmic ...

  8. PDF A comprehensive survey of data mining

    To take a holistic view of the research trends in the area of data mining, a comprehensive survey is presented in this paper. This paper presents a systematic and comprehensive survey of various data mining tasks and techniques. Further, various real-life applications of data mining are presented in this paper.

  9. Data Mining: Research Trends, Challenges, and Applications

    Data mining is a discipline that makes an important contribution to data analysis, discovery of new meaningful knowledge, and autonomous decision making. The rough set theory offers a viable approach for decision rule extraction from data. This paper introduces the fundamental concepts of rough set theory and other aspects of data mining, a ...

  10. The Survey of Data Mining Applications And Feature Scope

    Data mining, popularly known as Knowledge Discovery in Databases (KDD), is the nontrivial extraction of implicit, previously unknown and potentially useful information from data in databases [3, 5]. It is the process of finding hidden information and patterns in repositories [1, 3, 5].

  11. Big Data Mining and Analytics

  12. Data Mining: Data Mining Concepts and Techniques

    Data mining is a field at the intersection of computer science and statistics, used to discover patterns in large collections of data. The main aim of the data mining process is to extract useful information from a body of data and mold it into an understandable structure for future use. Different processes and techniques are used to carry out data mining successfully.

  13. (PDF) Top 10 Datamining Papers

    DATA MINING. 2.1 Definition of Data Mining. Data mining is an essential step in the knowledge discovery in databases (KDD) process that produces useful patterns or models from data (Figure 2) [7]. The terms KDD and data mining are different. KDD refers to the overall process of discovering useful knowledge from data.

  14. PDF CS145: INTRODUCTION TO DATA MINING

    Multi-Dimensional View of Data Mining • Data to be mined • Database data (extended-relational, object-oriented, heterogeneous, legacy), data warehouse, transactional data, stream, spatiotemporal, time-series, sequence, text and web, multi-media, graphs & social and information networks • Knowledge to be mined (or: Data mining functions

  15. (PDF) Web Mining: A Survey of Current Research, Techniques, and

    1. Introduction. In the data mining communities, there are three types of mining: data mining, web mining, and text mining [25]. There are many challenging problems in data/web/text mining research [53]. Data mining mainly deals with structured data organized in a database (DB) while text mining mainly handles unstructured data/text.

  16. The plan to mine the world's research papers

    The power of data mining. ... Nehru University in New Delhi to extract text and images from 73 million research papers. ... sources that provide free-to-download versions of papers (such as PubMed ...

  17. (PDF) Data mining techniques and applications

    Data Mining Algorithms and Techniques. Various algorithms and techniques like Classification, Clustering, Regression, Artificial Intelligence, Neural Networks, Association Rules, Decision Trees ...

  18. Top 10 Data Mining Papers: Recommended Reading

    Research in data mining will continue to grow in business and in learning organizations over the coming decades. This review paper explores the applications of data mining techniques that have been developed to support the knowledge management process. The journal articles indexed in the ScienceDirect database from 2007 to 2012 are analyzed and classified ...

  19. Data mining for dummies : Seltzer, Meta Brown : Free Download, Borrow

    Catching the data-mining train -- A day in your life as a data miner -- Teaming up to reach your goals -- Learning the laws of data mining -- Embracing the data-mining process -- Planning for data-mining success -- Gearing up with the right software -- Digging into your data -- Making new data -- Ferreting out public data sources -- Buying data ...

  20. Data Mining Ieee Papers and Projects-2020

    Data mining is defined as a process used to extract usable data from a larger set of raw data. It implies analysing data patterns in large batches of data using one or more software tools. Data mining has applications in multiple fields, like science and research. As an ... Breast cancer prediction using Data mining

  21. (PDF) A Review Paper on Big Data and Data Mining ...

    A REVIEW PAPER ON BIG DATA AND DATA MINING. Concepts and Techniques. Prasdika F.B.S 1, Dr. Bambang Sugiantoro, S.Si., M.T 2. Research Scholar 1, Faculty Advisor 2. Faculty of Science and ...

  22. data mining research papers free download

    Open Source and data mining software. ADaMSoft is free, open source data mining software developed in Java. It contains data management methods and can create ready-to-use reports. It can read data from several sources and write the results in different formats.