WEKA: Remote Experiment

permite computação distribuida usando um servidor com algoritmos WEKA

permite computação distribuída usando um servidor com algoritmos WEKA

Remote experiments enable you to distribute the computing load across multiple computers. In the following we will discuss the setup and operation for HSQLDB and MySQL.

Tags: , , ,

Survs: Ferramenta para inquéritos on-line

Boa ferramenta para construção de inquéritos on-line

Boa ferramenta para construção de inquéritos on-line

Create online surveys with your team easily and efficiently.

Survs is a web-based tool to create, distribute, and analyze online surveys. Its friendly interface and compelling features provide everything you need to get feedback.

Tags: ,

List of R Resources

muito boa lista de recursos sobre R

muito boa lista de recursos sobre R

There is a wealth of resources on the Web and elsewhere to learn more about R.  Here are some of the best.

Tags: , , ,

LIBSVM — A Library for Support Vector Machines

Página dos autores da biblioteca LIBSVM, a mais usada para SVM

Página dos autores da biblioteca LIBSVM, a mais usada para SVM

LIBSVM — A Library for Support Vector Machines

Chih-Chung Chang and Chih-Jen Lin


Version 3.17 released on April Fools’ day, 2013. We slightly adjust the way class labels are handled internally. By default labels are ordered by their first occurrence in the training set. Hence for a set with -1/+1 labels, if -1 appears first, then internally -1 becomes +1. This has caused confusion. Now for data with -1/+1 labels, we specifically ensure that internally the binary SVM has positive data corresponding to the +1 instances. For developers, see changes in the subrouting svm_group_classes of svm.cpp.
We now have a nice page LIBSVM data sets providing problems in LIBSVM format.
A practical guide to SVM classification is available now! (mainly written for beginners)
LIBSVM tools available now!
We now have an easy script (easy.py) for users who know NOTHING about svm. It makes everything automatic–from data scaling to parameter selection.
The parameter selection tool grid.py generates the following contour of cross-validation accuracy. To use this tool, you also need to install python and gnuplot.

Tags: , , , , ,

Cross-validation in RapidMiner

Explica como utilizar a validação cruzada no RapidMiner

Explica como utilizar a validação cruzada no RapidMiner

Cross-validation is a standard statistical method to estimate the generalization error of a predictive model. In k-fold cross-validation a training set is divided into k equal-sized subsets. Then the following procedure is repeated for each subset: a model is built using the other (k - 1) subsets as the training set and its performance is evaluated on the current subset. This means that each subset is used for testing exactly once. The result of the cross-validation is the average of the performances obtained from the k rounds.

This post explains how to interpret cross-validation results in RapidMiner.

Tags: , ,

wekalist – resposta a questões

Um fórum de discussão sobre os algoritmos do WEKA

Um fórum de discussão sobre os algoritmos do WEKA

WEKA

This forum is an archive for the mailing list wekalist@list.scms.waikato.ac.nz (more options) Messages posted here will be sent to this mailing list.

WEKA machine learning software discussion

Tags: , ,

WEKA Cost Benefit Analysis

Análise de custo benefício para avaliação de modelos

Análise de custo benefício para avaliação de modelos

The Cost/Benefit analysis component is a new visualization tool that was released in Weka versions 3.6.2 and 3.7.1. The tool is particularly useful for the analysis of predictive analytic outcomes for direct mail campaigns (or any ranking application where costs are involved). It allows the user to explore various cost/benefit tradeoffs by interactively selecting different population sizes from the ranked list of prospects or by varying the threshold on the predicted probability of the positive class.

The Cost/Benefit analysis tool is available from both the Explorer and Knowledge Flow user interfaces. In the figure below, the Knowledge Flow is being used to build a predictive model for a real-world direct mail application. The data is historical campaign data from a mail out to solicit donations to a charitable organization. The data set contains 47,706 records with 476 variables (summary variables for donor lifetime giving history, overlay demographics etc.). The percentage of donors in the data is approximately 5%. A 10-fold cross-validation is used to generate predictions from a naive Bayes classifier, and these are then passed to the Cost/Benefit analysis tool.

Tags: ,

Mathematical Programming in MathProg

Interface para resolução de probs PL ou MIP com java

Interface para resolução de probs PL ou MIP com java

GNU MathProg (part of the open source GNU GLPK project) is a modeling language for describing linear and discrete optimization problems. Use this page to create and solve MathProg models using the glpk.js solver. You can either open a model from your computer, select an example from the menu bar, or create your own from scratch. To use —

  1. Create a MathProg model in the Model Editor.
  2. Click ‘Solve’ to generate the solution.
  3. Click ‘Save As…’ to save the model locally to your computer.

Tags: ,

How to: network animation with R and the iGraph

Construir animações de iformação relacional com R

Construir animações de informação relacional com R

This article lists the steps I take to create a network animation in R, provides some example source code that you can copy and modify for your own work, and starts a discussion about programming and visualization as an interpretive approach in research. Before I start, take a look at this network animation created with R and the iGraph package. This animation is of a retweet network related to #BankTransferDay. Links (displayed as lines) are retweets, nodes (displayed as points) are user accounts. For each designated period of time (in this case, an hour), retweets are drawn and then fade out over 24 hours.

Tags: , , ,

Advanced Statistics

Bons slides e outros materiais sobre clusters, AFE, SEM, reg logistica, meta-análise, MANOVA, Reliability

Bons slides e outros materiais sobre clusters, AFE, SEM, reg logistica, meta-análise, MANOVA, Reliability

Welcome to Malbowges, the part of Nether Hell dominated by thieves, counsellors of Fraud (or should that just be counsellors), falsifiers and sowers of discord. It’s not a nice place for Sunday lunch. You must wade through rivers of Lucifer’s sputum to reach the answers you seek, and when you find those answers, you’ll probably wish you hadn’t bothered. Revenge is mine, ah ha ha, yah ha ha, ya ha ha ha ha ha ha ha ha ha …

Tags: , , , , ,