WEKA: Remote Experiment

Posted by Armando Brito Mendes | Filed under software

permite computação distribuida usando um servidor com algoritmos WEKA

permite computação distribuída usando um servidor com algoritmos WEKA

Remote experiments enable you to distribute the computing load across multiple computers. In the following we will discuss the setup and operation for HSQLDB and MySQL.

Tags: análise de dados, data mining, DW \ BI, WEKA

Read more | Comments off | June 28th, 2013

Survs: Ferramenta para inquéritos on-line

Posted by Armando Brito Mendes | Filed under estatística, software

Boa ferramenta para construção de inquéritos on-line

Create online surveys with your team easily and efficiently.

Survs is a web-based tool to create, distribute, and analyze online surveys. Its friendly interface and compelling features provide everything you need to get feedback.

Tags: inquéritos, software estatístico

Read more | Comments off | June 27th, 2013

List of R Resources

Posted by Armando Brito Mendes | Filed under estatística, materiais para profissionais, software

muito boa lista de recursos sobre R

There is a wealth of resources on the Web and elsewhere to learn more about R. Here are some of the best.

Tags: data mining, Estat Descritiva, R-software, software estatístico

Read more | Comments off | June 13th, 2013

LIBSVM — A Library for Support Vector Machines

Posted by Armando Brito Mendes | Filed under software

Página dos autores da biblioteca LIBSVM, a mais usada para SVM

LIBSVM — A Library for Support Vector Machines

Chih-Chung Chang and Chih-Jen Lin

Version 3.17 released on April Fools’ day, 2013. We slightly adjust the way class labels are handled internally. By default labels are ordered by their first occurrence in the training set. Hence for a set with -1/+1 labels, if -1 appears first, then internally -1 becomes +1. This has caused confusion. Now for data with -1/+1 labels, we specifically ensure that internally the binary SVM has positive data corresponding to the +1 instances. For developers, see changes in the subrouting svm_group_classes of svm.cpp.
We now have a nice page LIBSVM data sets providing problems in LIBSVM format.
A practical guide to SVM classification is available now! (mainly written for beginners)
LIBSVM tools available now!
We now have an easy script (easy.py) for users who know NOTHING about svm. It makes everything automatic–from data scaling to parameter selection.
The parameter selection tool grid.py generates the following contour of cross-validation accuracy. To use this tool, you also need to install python and gnuplot.

Tags: captura de conhecimento, data mining, otimização, R-software, RapidMiner, WEKA

Read more | Comments off | May 30th, 2013

Cross-validation in RapidMiner

Posted by Armando Brito Mendes | Filed under software

Explica como utilizar a validação cruzada no RapidMiner

Cross-validation is a standard statistical method to estimate the generalization error of a predictive model. In $k$ -fold cross-validation a training set is divided into $k$ equal-sized subsets. Then the following procedure is repeated for each subset: a model is built using the other $(k - 1)$ subsets as the training set and its performance is evaluated on the current subset. This means that each subset is used for testing exactly once. The result of the cross-validation is the average of the performances obtained from the $k$ rounds.

This post explains how to interpret cross-validation results in RapidMiner.

Tags: captura de conhecimento, data mining, RapidMiner

Read more | Comments off | May 24th, 2013

wekalist – resposta a questões

Posted by Armando Brito Mendes | Filed under software

Um fórum de discussão sobre os algoritmos do WEKA

WEKA

This forum is an archive for the mailing list wekalist@list.scms.waikato.ac.nz (more options) Messages posted here will be sent to this mailing list.

WEKA machine learning software discussion

Tags: captura de conhecimento, data mining, WEKA

Read more | Comments off | May 23rd, 2013

WEKA Cost Benefit Analysis

Posted by Armando Brito Mendes | Filed under SAD - DSS, software, visualização

Análise de custo benefício para avaliação de modelos

The Cost/Benefit analysis component is a new visualization tool that was released in Weka versions 3.6.2 and 3.7.1. The tool is particularly useful for the analysis of predictive analytic outcomes for direct mail campaigns (or any ranking application where costs are involved). It allows the user to explore various cost/benefit tradeoffs by interactively selecting different population sizes from the ranked list of prospects or by varying the threshold on the predicted probability of the positive class.

The Cost/Benefit analysis tool is available from both the Explorer and Knowledge Flow user interfaces. In the figure below, the Knowledge Flow is being used to build a predictive model for a real-world direct mail application. The data is historical campaign data from a mail out to solicit donations to a charitable organization. The data set contains 47,706 records with 476 variables (summary variables for donor lifetime giving history, overlay demographics etc.). The percentage of donors in the data is approximately 5%. A 10-fold cross-validation is used to generate predictions from a naive Bayes classifier, and these are then passed to the Cost/Benefit analysis tool.

Tags: data mining, WEKA

Read more | Comments off | February 25th, 2013

Mathematical Programming in MathProg

Posted by Armando Brito Mendes | Filed under Investigação Operacional, SAD - DSS, software

Interface para resolução de probs PL ou MIP com java

GNU MathProg (part of the open source GNU GLPK project) is a modeling language for describing linear and discrete optimization problems. Use this page to create and solve MathProg models using the glpk.js solver. You can either open a model from your computer, select an example from the menu bar, or create your own from scratch. To use —

Create a MathProg model in the Model Editor.
Click ‘Solve’ to generate the solution.
Click ‘Save As…’ to save the model locally to your computer.

Tags: otimização, software de otimização

Read more | Comments off | January 10th, 2013

How to: network animation with R and the iGraph

Posted by Armando Brito Mendes | Filed under ARS - SNA, software

Construir animações de iformação relacional com R

Construir animações de informação relacional com R

This article lists the steps I take to create a network animation in R, provides some example source code that you can copy and modify for your own work, and starts a discussion about programming and visualization as an interpretive approach in research. Before I start, take a look at this network animation created with R and the iGraph package. This animation is of a retweet network related to #BankTransferDay. Links (displayed as lines) are retweets, nodes (displayed as points) are user accounts. For each designated period of time (in this case, an hour), retweets are drawn and then fade out over 24 hours.

Tags: captura de conhecimento, data mining, grafos, R-software

Read more | Comments off | January 9th, 2013

Advanced Statistics

Posted by Armando Brito Mendes | Filed under estatística, materiais ensino, software, visualização

Bons slides e outros materiais sobre clusters, AFE, SEM, reg logistica, meta-análise, MANOVA, Reliability

Welcome to Malbowges, the part of Nether Hell dominated by thieves, counsellors of Fraud (or should that just be counsellors), falsifiers and sowers of discord. It’s not a nice place for Sunday lunch. You must wade through rivers of Lucifer’s sputum to reach the answers you seek, and when you find those answers, you’ll probably wish you hadn’t bothered. Revenge is mine, ah ha ha, yah ha ha, ya ha ha ha ha ha ha ha ha ha …

Tags: captura de conhecimento, data mining, decisão médica, IBM SPSS Statistics, inferência, software estatístico

Read more | Comments off | January 9th, 2013

« Older Entries

Newer Entries »

Armando B. Mendes

WEKA: Remote Experiment

Survs: Ferramenta para inquéritos on-line

Create online surveys with your team easily and efficiently.

Survs is a web-based tool to create, distribute, and analyze online surveys. Its friendly interface and compelling features provide everything you need to get feedback.

List of R Resources

LIBSVM — A Library for Support Vector Machines

LIBSVM — A Library for Support Vector Machines

Chih-Chung Chang and Chih-Jen Lin

Cross-validation in RapidMiner

wekalist – resposta a questões

WEKA

WEKA Cost Benefit Analysis

Mathematical Programming in MathProg

How to: network animation with R and the iGraph

Advanced Statistics

Categorias de Posts

Palavras chave mais usadas

Arquivo

Recent Posts

Recent Comments

About