The 7 Most Important Data Mining Techniques

A short introduction to some of the most widely used data mining methods.

Data mining is the process of looking at large banks of information to generate new information. Intuitively, you might think that data “mining” refers to the extraction of new data, but this isn’t the case; instead, data mining is about extrapolating patterns and new knowledge from the data you’ve already collected.

Relying on techniques and technologies from the intersection of database management, statistics, and machine learning, specialists in data mining have dedicated their careers to better understanding how to process and draw conclusions from vast amounts of information. But what are the techniques they use to make this happen?

Playground to Politics

Data from a questionnaire given to 50 London teachers.

A study of values and attitudes among fifth formers in a North London comprehensive school.

This survey of teenage attitudes and opinions in a North London comprehensive school (11-18 mixed) was designed and conducted, under my guidance and supervision, by three of my sophomore students as part of their group research dissertation for BA Applied Social Studies (Social Research) at the Polytechnic of North London (PNL, now part of London Metropolitan University). It aimed to discover something about pupils’ future expectations and awareness of, and attitudes towards, various current social issues and problems, particularly racism and sexism. It replicates various items and scales from other work (Wilson-Patterson, Eysenck, Himmelweit, Srole-Christie), particularly the St Paul’s Girls senior pupils study (Feb 1973), some of which were also used in the SSRC Survey Unit Quality of Life surveys 1971-75.

The self-completion questionnaire was completed in December 1981 by all fifth form pupils present on the day of the survey (N=142).  It was administered during time-tabled Social Studies classes and, time permitting, was followed by discussion with class teachers and the PNL students of the issues covered in the survey.

Given the particularly high quality of this project, a user manual was prepared by John Hall and Alison Walker for use with the postgraduate Survey Analysis Workshop and the undergraduate course Data Management and Analysis. It serves as model documentation for similar small survey projects.

KNIME course

A very good KNIME course; it is introductory but covers a large number of features.

KNIME Online Self-Training

Welcome to the KNIME Self-training course. The focus of this document is to get you started with KNIME as quickly as possible and guide you through essential steps of advanced analytics with KNIME. Optional and very useful topics such as reporting, KNIME Server and database handling are also included to give you an idea of what else is possible with KNIME.

  1. Installing KNIME Analytics Platform and Extensions
  2. Data Import / Export and Database / Big Data
  3. ETL
  4. Visualization
  5. Advanced Analytics
  6. Reporting
  7. KNIME Server

MARS – Multivariate Adaptive Regression Splines

A good description of these data analysis algorithms by the authors themselves.

An Overview of MARS

What is “MARS”?

MARS®, an acronym for Multivariate Adaptive Regression Splines, is a multivariate non-parametric regression procedure introduced in 1991 by world-renowned Stanford statistician and physicist, Jerome Friedman (Friedman, 1991). Salford Systems’ MARS, based on the original code, has been substantially enhanced with new features and capabilities in exclusive collaboration with Friedman.

How to create a slicer in Excel

A good tutorial on how to use one of Excel's new features.

For dashboards and quick filtering, you can’t beat Excel slicers. They’re easy to implement and even easier to use. Here are the basics–plus a few power tips.

SAP video analytics

Lots of videos about SAP analytics.
Digital Enterprise Platform
SAP Digital Business Services
SAPIndustry
SAPLineOfBusiness

SME Solutions and Partner Innovation

MySQL Documentation

Lots of documentation covering all MySQL products.

Deeplearning4j Documentation

The website of a Java package for deep learning, with lots of information on neural networks and related topics.

The Many Faces of ROC Analysis

A good tutorial on ROC curves.

Receiver Operating Characteristics (ROC) Analysis originated from signal detection theory, as a model of how well a receiver is able to detect a signal in the presence of noise. Its key feature is the distinction between hit rate (or true positive rate) and false alarm rate (or false positive rate) as two separate performance measures. ROC analysis has also widely been used in medical data analysis to study the effect of varying the threshold on the numerical outcome of a diagnostic test. It has been introduced to machine learning relatively recently, in response to classification tasks with varying class distributions or misclassification costs (hereafter referred to as skew). ROC analysis is set to cause a paradigm shift in machine learning. Separating performance on classes is almost always a good idea from an analytical perspective. For instance, it can help us to

  • understand the behaviour and skew-sensitivity of many machine learning metrics, including rule learning heuristics and decision tree splitting criteria, by plotting their isometrics in ROC space;
  • develop new metrics specifically designed to improve the Area Under the ROC Curve (AUC) of a model;
  • understand fundamental algorithms such as the separate-and-conquer or sequential covering rule learning algorithm, by tracing its trajectory through a sequence of ROC spaces.

The goal of this tutorial is to develop the ROC perspective in a systematic way, demonstrating the many faces of ROC analysis in machine learning.
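
As a rough illustration of the idea described above (this sketch is not taken from the tutorial itself), the snippet below sweeps a decision threshold over a classifier's scores, records the false positive rate and true positive rate at each step, and estimates the AUC with the trapezoidal rule. The function names `roc_points` and `auc`, and the toy labels and scores, are assumptions made up for this example.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs obtained by
    lowering the decision threshold over the scores, from strictest to loosest.
    Labels are 1 (positive) or 0 (negative); both classes must be present."""
    pos = sum(labels)
    neg = len(labels) - pos
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1  # a hit
            else:
                fp += 1  # a false alarm
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical toy data: three positives and three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
area = auc(points)
print(points)
print(area)  # 8/9 for this toy data: 8 of the 9 positive/negative pairs are ranked correctly
```

Because the true positive rate and false positive rate are each computed within a single class, the curve does not change when the class distribution (the skew) changes, which is the separation of per-class performance that the excerpt emphasises.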

Best Data Science Learning podcasts

KDnuggets

Very good podcasts that cover introductory topics.

We present the top 12 Data Science & Machine Learning related Podcasts by popularity on iTunes. Check out latest episodes to stay up-to-date & become a part of the data conversations!

By Bhavya Geethika Peddibhotla.

Learn Data science the new way by listening to these compelling story tellers, interviewers, educators and experts in the field. Data suggests that podcasting about Data Science is only growing!
