Data Mining with Weka MOOC
Posted by Armando Brito Mendes | Filed under Academic Qualifications, teaching materials, software, videos
Welcome to the free online course Data Mining with Weka
This 5-week MOOC introduced data mining concepts through practical experience with the free Weka tool.
The course featured:
- video lectures by Professor Ian H. Witten
- the open-source Weka data mining platform
- access to chapters from Data Mining (3rd Edition)
- discounts from Morgan Kaufmann
- online assessment leading to a statement of completion
The course will run again in early March 2014. To get notified about dates (enrolment, commencement), please subscribe to the announcement forum.
You can access the course material (videos, slides, etc.) from here.
Tags: data mining, statistical software, WEKA
WEKA: Remote Experiment
Posted by Armando Brito Mendes | Filed under software
Remote experiments enable you to distribute the computing load across multiple computers. The linked article discusses the setup and operation for HSQLDB and MySQL.
Tags: data analysis, data mining, DW \ BI, WEKA
LIBSVM — A Library for Support Vector Machines
Posted by Armando Brito Mendes | Filed under software
Chih-Chung Chang and Chih-Jen Lin
Version 3.17 was released on April Fools’ Day, 2013. We slightly adjusted the way class labels are handled internally. By default, labels are ordered by their first occurrence in the training set. Hence, for a set with -1/+1 labels, if -1 appears first, then internally -1 becomes +1. This has caused confusion. Now, for data with -1/+1 labels, we specifically ensure that internally the binary SVM has positive data corresponding to the +1 instances. For developers, see the changes in the subroutine svm_group_classes of svm.cpp.
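The label-ordering behaviour described in this release note can be illustrated with a small sketch. This is not LIBSVM code: the function names are hypothetical, and the logic is only an approximation of what the note says happens inside svm_group_classes.

```python
def first_occurrence_order(labels):
    """Return the distinct labels in order of first appearance in the training set."""
    seen = []
    for y in labels:
        if y not in seen:
            seen.append(y)
    return seen


def internal_label_order(labels):
    """Illustrative approximation of the adjusted ordering (hypothetical helper)."""
    order = first_occurrence_order(labels)
    # For two-class data labelled -1/+1 where -1 came first, swap the order
    # so that +1 is treated as the positive class internally.
    if len(order) == 2 and order[0] == -1 and order[1] == 1:
        order = [1, -1]
    return order


train_labels = [-1, -1, 1, -1, 1]
print(first_occurrence_order(train_labels))  # ordering by first occurrence
print(internal_label_order(train_labels))    # ordering with +1 forced first
```

Labels other than -1/+1 are left in first-occurrence order in this sketch, which matches the default described in the note.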
We now have a nice page LIBSVM data sets providing problems in LIBSVM format.
A practical guide to SVM classification is available now! (mainly written for beginners)
LIBSVM tools available now!
We now have an easy script (easy.py) for users who know NOTHING about SVM. It makes everything automatic, from data scaling to parameter selection.
The parameter selection tool grid.py generates a contour plot of cross-validation accuracy. To use this tool, you also need to install Python and gnuplot.
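The idea behind this kind of parameter selection can be sketched as an exhaustive search over an exponential grid of C and gamma values. The sketch below is not grid.py itself: `grid_search` and `toy_accuracy` are hypothetical names, the grid ranges are chosen only for illustration, and the scoring function is a stand-in for a real cross-validation run.

```python
import itertools
import math


def grid_search(evaluate, log2c_values, log2g_values):
    """Try every (C, gamma) pair on the grid and return (best_score, C, gamma)."""
    best = None
    for log2c, log2g in itertools.product(log2c_values, log2g_values):
        c, gamma = 2.0 ** log2c, 2.0 ** log2g
        score = evaluate(c, gamma)
        if best is None or score > best[0]:
            best = (score, c, gamma)
    return best


def toy_accuracy(c, gamma):
    # Hypothetical stand-in for cross-validation accuracy; a real run would
    # train and validate an SVM for each (C, gamma) pair instead.
    return 100.0 - (math.log2(c) - 1) ** 2 - (math.log2(gamma) + 3) ** 2


# Assumed grid: log2(C) from -5 to 15 and log2(gamma) from 3 to -15, in steps of 2.
best_score, best_c, best_gamma = grid_search(
    toy_accuracy, range(-5, 16, 2), range(3, -16, -2)
)
print(best_score, best_c, best_gamma)
```

Plotting the score for every grid point, rather than keeping only the maximum, would give the kind of accuracy contour the tool produces.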
Tags: knowledge capture, data mining, optimization, R-software, RapidMiner, WEKA
wekalist: answers to questions
Posted by Armando Brito Mendes | Filed under software
WEKA
WEKA machine learning software discussion
Tags: knowledge capture, data mining, WEKA
WEKA Cost Benefit Analysis
Posted by Armando Brito Mendes | Filed under SAD - DSS, software, visualization
The Cost/Benefit analysis component is a new visualization tool that was released in Weka versions 3.6.2 and 3.7.1. The tool is particularly useful for the analysis of predictive analytic outcomes for direct mail campaigns (or any ranking application where costs are involved). It allows the user to explore various cost/benefit tradeoffs by interactively selecting different population sizes from the ranked list of prospects or by varying the threshold on the predicted probability of the positive class.
The Cost/Benefit analysis tool is available from both the Explorer and Knowledge Flow user interfaces. In the figure below, the Knowledge Flow is being used to build a predictive model for a real-world direct mail application. The data is historical campaign data from a mail out to solicit donations to a charitable organization. The data set contains 47,706 records with 476 variables (summary variables for donor lifetime giving history, overlay demographics etc.). The percentage of donors in the data is approximately 5%. A 10-fold cross-validation is used to generate predictions from a naive Bayes classifier, and these are then passed to the Cost/Benefit analysis tool.
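The tradeoff the tool lets you explore can be sketched in a few lines: rank prospects by predicted probability, then compute the profit of mailing only the top n. This is a minimal illustration, not Weka's implementation; the function name, the cost and benefit figures, and the six-record data set are all made up for the example.

```python
def profit_curve(probabilities, actuals, cost_per_contact, benefit_per_response):
    """Return (n_contacted, threshold, profit) for every cut of the ranked list."""
    # Rank prospects from most to least likely to respond.
    ranked = sorted(zip(probabilities, actuals), key=lambda pair: pair[0], reverse=True)
    curve = []
    responders = 0
    for n, (probability, is_responder) in enumerate(ranked, start=1):
        responders += is_responder
        profit = responders * benefit_per_response - n * cost_per_contact
        # Contacting the top n prospects is equivalent to using the n-th
        # probability as the classification threshold.
        curve.append((n, probability, profit))
    return curve


# Hypothetical toy data: predicted probability of donating and the true outcome.
probs = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1]
actual = [1, 0, 1, 0, 0, 0]

curve = profit_curve(probs, actual, cost_per_contact=1.0, benefit_per_response=5.0)
best = max(curve, key=lambda row: row[2])
print(best)  # the most profitable cut: (prospects contacted, threshold, profit)
```

Changing the cost or benefit values shifts the most profitable cut point, which is the interaction the visualization exposes through its population-size and threshold controls.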
Tags: data mining, WEKA
Weka: Decision Trees Tutorial
Posted by Armando Brito Mendes | Filed under Uncategorized
- Tutorial (1): A simple decision tree
- Tutorial (2): Exercise 1
- Tutorial (3): Occam’s Razor
- Tutorial (4): ID3
- Tutorial (5): Exercise 2
- Tutorial (6): Entropy Bias
- Tutorial (7): Exercise 3
- Tutorial (8): Other Splitting Criteria
- Tutorial (9): Exercise 4
- Tutorial (10): Advanced Topics
- Tutorial (11): Evaluating Decision Trees
- Tutorial (12): Exercise 5
- Tutorial (13): Overfitting
- Tutorial (14): Pruning
- Tutorial (15): Exercise 6
- Tutorial (16): Further Topics
- Tutorial (17): Conclusion
Tags: data mining, WEKA
Text categorization with Weka
Posted by Armando Brito Mendes | Filed under software
Tags: data mining, text mining, WEKA
Weka Video Tutorials
Posted by Armando Brito Mendes | Filed under statistics, teaching materials, videos
These videos answer many frequently asked questions.
Some of them are not working.
Tags: data mining, WEKA