WEKA: Remote Experiment

Enables distributed computing using a server that runs WEKA algorithms.

Remote experiments enable you to distribute the computing load across multiple computers. In the following we will discuss the setup and operation for HSQLDB and MySQL.

Survs: Online Survey Tool

A good tool for building online surveys.

Create online surveys with your team easily and efficiently.

Survs is a web-based tool to create, distribute, and analyze online surveys. Its friendly interface and compelling features provide everything you need to get feedback.

List of R Resources

A very good list of resources on R.

There is a wealth of resources on the Web and elsewhere to learn more about R. Here are some of the best.

Introduction to R for SAS and SPSS Users

Some useful comparisons and advice.

R is free software for data analysis and graphics that is similar to SAS and SPSS. Two million people are part of the R Open Source Community. Its use is growing very rapidly, and Revolution Analytics distributes a commercial version of R that adds capabilities that are not available in the Open Source version. This 60-minute webinar is for people who are familiar with SAS or SPSS and want to know how R can strengthen their analytics strategy. It will include:

  • What R is and how it compares to SAS and SPSS
  • An overview of how to install and maintain it
  • How to find R add-on modules comparable to those for SAS and SPSS
  • Which of R’s many user interfaces are most like those of SAS and SPSS
  • How to run R from within SAS and SPSS
  • What a simple R program looks like
  • Q&A with Bob Muenchen

Replay the webcast and find out how SAS and SPSS users can take advantage of R.

Using Metadata to Find Paul Revere

An example of a social network study.

It’s just metadata. What can you do with that? Kieran Healy, a sociology professor at Duke University, shows what you can do with just some basic social network analysis. Using metadata from Paul Revere’s Ride on the groups that people belonged to, Healy sniffs out Paul Revere as a main target. Bonus points for writing the summary from the point of view of an 18th-century analyst.

What a nice picture! The analytical engine has arranged everyone neatly, picking out clusters of individuals and also showing both peripheral individuals and—more intriguingly—people who seem to bridge various groups in ways that might perhaps be relevant to national security. Look at that person right in the middle there. Zoom in if you wish. He seems to bridge several groups in an unusual (though perhaps not unique) way. His name is Paul Revere.

You can grab the R code and dataset on GitHub, too, if you want to follow along.

Stupid Calculations

Because calculations can be fun too.

Josh Orter takes back-of-the-napkin math to the next level with Stupid Calculations, which promises to turn practical facts into utterly useless ones. Stupid calculation number one is the size of a giant iPhone screen if you combined all the iPhone screens ever sold into one.

The eye-glazing calculations are laid out below for those who appreciate the dirty work but, skipping ahead, the Kubrick-inspired monophone would stretch 5,059 feet into the sky and have a base measuring 2,846 feet across (Central Park is 2,640 feet wide). Its surface area would take in 2.07 billion square inches. That’s 14.39 million square feet or 330.54 acres. The new World Trade Center, by comparison, will have a surface area of 23 glass-clad acres, giving us enough screenage to watch Game of Thrones on all four sides of fourteen WTCs.
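The unit conversions in the quoted figures can be checked with a few lines of Python. This is only a sketch: it assumes the combined screen is a flat rectangle whose sides are the quoted height and base, and it takes the 23-acre World Trade Center figure from the text as given. The variable names are illustrative.

```python
# Check the unit conversions quoted above.
# Assumption: the giant screen is a flat rectangle, height x base.
SQ_INCHES_PER_SQ_FOOT = 12 * 12   # 144 square inches in a square foot
SQ_FEET_PER_ACRE = 43_560         # square feet in one acre

height_ft = 5_059        # height quoted in the text
base_ft = 2_846          # base width quoted in the text
wtc_glass_acres = 23     # World Trade Center glass area quoted in the text

area_sq_ft = height_ft * base_ft
area_sq_in = area_sq_ft * SQ_INCHES_PER_SQ_FOOT
area_acres = area_sq_ft / SQ_FEET_PER_ACRE
whole_wtcs = int(area_acres // wtc_glass_acres)

print(f"{area_sq_ft:,} square feet")
print(f"{area_sq_in / 1e9:.2f} billion square inches")
print(f"{area_acres:.1f} acres")
print(f"enough glass for {whole_wtcs} whole WTCs")
```

Under that rectangle assumption the results agree with the quoted square-inch, square-foot, and fourteen-WTC figures to within rounding.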

See also how long it would take to drink the water in an Olympic-sized pool through a straw.

Map Blog Dashboard

A dashboard of the most-watched videos by region (US only).

Videos uploaded within 48 hours may not yet appear in age and gender breakdowns.

Agile & Scrum Portugal

They organize events on agile methodologies and Scrum.

Agile & Scrum Portugal 2013 is getting ready to be another awesome event!

This year’s program combines AgilePT with the ScrumPT annual gathering, and it will therefore accommodate the interests of the entire agile community in Portugal.

LIBSVM — A Library for Support Vector Machines

Page by the authors of LIBSVM, the most widely used SVM library.

Chih-Chung Chang and Chih-Jen Lin


  • Version 3.17 released on April Fools’ day, 2013. We slightly adjust the way class labels are handled internally. By default labels are ordered by their first occurrence in the training set. Hence for a set with -1/+1 labels, if -1 appears first, then internally -1 becomes +1. This has caused confusion. Now for data with -1/+1 labels, we specifically ensure that internally the binary SVM has positive data corresponding to the +1 instances. For developers, see changes in the subroutine svm_group_classes of svm.cpp.
  • We now have a nice page LIBSVM data sets providing problems in LIBSVM format.
  • A practical guide to SVM classification is available now! (mainly written for beginners)
  • LIBSVM tools available now!
  • We now have an easy script (easy.py) for users who know NOTHING about svm. It makes everything automatic, from data scaling to parameter selection.
  • The parameter selection tool grid.py generates the following contour of cross-validation accuracy. To use this tool, you also need to install python and gnuplot.
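The label-handling change in the 3.17 note can be illustrated with a small pure-Python sketch. It is not LIBSVM's code, and `group_labels` is a hypothetical helper. It only mimics the behaviour described in the note, assuming that the first label in the resulting order is the one treated as the internal positive class.

```python
def group_labels(labels):
    """Return distinct labels in order of first occurrence, applying the
    -1/+1 adjustment described in the 3.17 release note.

    Illustrative only: a hypothetical helper, not LIBSVM's svm_group_classes.
    """
    ordered = []
    for y in labels:
        if y not in ordered:
            ordered.append(y)
    # Two-class data labelled -1/+1 where -1 appeared first: swap the order
    # so that +1 comes first (assumed here to be the internal positive class).
    if ordered == [-1, 1]:
        ordered = [1, -1]
    return ordered


print(group_labels([-1, 1, -1, 1]))   # -1 seen first, order is still [1, -1]
print(group_labels([3, 1, 3, 2]))     # multi-class: plain first-occurrence order
```

Without the final swap, the first call would return `[-1, 1]`, which is the situation the release note says caused confusion.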

Cross-validation in RapidMiner

Explains how to use cross-validation in RapidMiner.

Cross-validation is a standard statistical method to estimate the generalization error of a predictive model. In k-fold cross-validation a training set is divided into k equal-sized subsets. Then the following procedure is repeated for each subset: a model is built using the other (k - 1) subsets as the training set and its performance is evaluated on the current subset. This means that each subset is used for testing exactly once. The result of the cross-validation is the average of the performances obtained from the k rounds.

This post explains how to interpret cross-validation results in RapidMiner.
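The k-fold procedure described above can be sketched in plain Python. This is a minimal illustration, not RapidMiner's implementation. The helper names, the majority-class "model", and the toy data are all made up for the example. Folds are contiguous and unshuffled here, and when the data does not divide evenly by k the fold sizes differ by at most one.

```python
from statistics import mean


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(X, y, k, fit, score):
    """Train on k - 1 folds, test on the remaining fold, and average the scores."""
    results = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        results.append(score(model,
                             [X[i] for i in test_idx],
                             [y[i] for i in test_idx]))
    return mean(results)


# Hypothetical baseline "model": always predict the most common training label.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def accuracy(model, X_test, y_test):
    return sum(1 for label in y_test if label == model) / len(y_test)


# Toy data, invented for the example.
X = list(range(10))
y = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
print(cross_validate(X, y, k=5, fit=fit_majority, score=accuracy))
```

Each sample lands in exactly one test fold, so every subset is used for testing exactly once, as in the description. In practice the data is usually shuffled (or stratified by class) before it is split.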