How and Why: Decorrelate Time Series

The problem of autocorrelation in time series.

When dealing with time series, the first step consists of isolating trends and periodicities. Once this is done, we are left with a normalized time series, and studying the auto-correlation structure is the next step, called model fitting. The purpose is to check whether the underlying data follows some well-known stochastic process with a similar auto-correlation structure, such as an ARMA process, using tools such as the Box-Jenkins method. Once a fit with a specific model is found, model parameters can be estimated and used to make predictions.

A deeper investigation consists of isolating the auto-correlations to see whether the remaining values, once decorrelated, behave like white noise. If a departure from white noise is found (using a few tests of randomness), then the time series in question exhibits unusual patterns not explained by trends, seasonality or auto-correlations. This can be useful knowledge in some contexts, such as high-frequency trading, random number generation, cryptography or cyber-security. The analysis of decorrelated residuals can also help identify change points and instances of slope changes in time series, or reveal otherwise undetected outliers.
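The decorrelation step can be illustrated with a small sketch. It assumes, purely for illustration, that the detrended series is well described by an AR(1) model; the article does not prescribe a specific model, and the function names (`autocorr`, `fit_ar1`, `decorrelate_ar1`, `simulate_ar1`) and the synthetic data are hypothetical. The idea is to estimate the lag-1 coefficient, subtract the predictable part, and check that the residuals show little remaining autocorrelation.

```python
import random


def autocorr(x, lag=1):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def fit_ar1(x):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] + noise (mean removed)."""
    mean = sum(x) / len(x)
    c = [v - mean for v in x]
    num = sum(c[t] * c[t - 1] for t in range(1, len(c)))
    den = sum(c[t - 1] ** 2 for t in range(1, len(c)))
    return num / den


def decorrelate_ar1(x):
    """Return (phi_hat, residuals) after removing the fitted AR(1) dependence."""
    phi = fit_ar1(x)
    mean = sum(x) / len(x)
    c = [v - mean for v in x]
    residuals = [c[t] - phi * c[t - 1] for t in range(1, len(c))]
    return phi, residuals


def simulate_ar1(phi, n, seed=0):
    """Synthetic AR(1) series with standard normal innovations (assumed test data)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0, 1))
    return x


series = simulate_ar1(phi=0.7, n=2000)
phi_hat, residuals = decorrelate_ar1(series)
print("estimated phi:", round(phi_hat, 3))
print("lag-1 autocorrelation before:", round(autocorr(series), 3))
print("lag-1 autocorrelation after: ", round(autocorr(residuals), 3))
```

If the residual autocorrelation stays clearly away from zero at some lags, or the residuals fail other randomness tests, that would be the kind of departure from white noise the passage describes. A real analysis would typically fit a richer ARMA model and test several lags rather than lag 1 only.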

The 7 Most Important Data Mining Techniques

A short introduction to some of the most widely used data mining methods.

Data mining is the process of looking at large banks of information to generate new information. Intuitively, you might think that data “mining” refers to the extraction of new data, but this isn’t the case; instead, data mining is about extrapolating patterns and new knowledge from the data you’ve already collected.

Relying on techniques and technologies from the intersection of database management, statistics, and machine learning, specialists in data mining have dedicated their careers to better understanding how to process and draw conclusions from vast amounts of information. But what are the techniques they use to make this happen?

Playground to Politics

Data from a questionnaire survey of fifth-form pupils in North London.

A study of values and attitudes among fifth formers in a North London comprehensive school.

This survey of teenage attitudes and opinions in a North London comprehensive school (11-18 mixed) was designed and conducted, under my guidance and supervision, by three of my sophomore students as part of their group research dissertation for BA Applied Social Studies (Social Research) at the Polytechnic of North London (PNL, now part of London Metropolitan University). It aimed to discover something about pupils’ future expectations and their awareness of, and attitudes towards, various current social issues and problems, particularly racism and sexism. It replicates various items and scales from other work (Wilson-Patterson, Eysenck, Himmelweit, Srole-Christie), particularly the St Paul’s Girls senior pupils study (Feb 1973), some of which were also used in the SSRC Survey Unit Quality of Life surveys 1971-75.

The self-completion questionnaire was completed in December 1981 by all fifth form pupils present on the day of the survey (N=142). It was administered during timetabled Social Studies classes and, time permitting, was followed by discussion with class teachers and the PNL students of the issues covered in the survey.

Given the particularly high quality of this project, a user manual was prepared by John Hall and Alison Walker for use with the postgraduate Survey Analysis Workshop and the undergraduate course Data Management and Analysis. It serves as model documentation for similar small survey projects.

KNIME Course

A very good KNIME course; it is introductory but covers a large number of features.

KNIME Online Self-Training

Welcome to the KNIME Self-training course. The focus of this document is to get you started with KNIME as quickly as possible and guide you through essential steps of advanced analytics with KNIME. Optional and very useful topics such as reporting, KNIME Server and database handling are also included to give you an idea of what else is possible with KNIME.

  1. Installing KNIME Analytics Platform and Extensions
  2. Data Import / Export and Database / Big Data
  3. ETL
  4. Visualization
  5. Advanced Analytics
  6. Reporting
  7. KNIME Server

MARS – Multivariate Adaptive Regression Splines

A good description of these data analysis algorithms by the authors themselves.

An Overview of MARS

What is “MARS”?

MARS®, an acronym for Multivariate Adaptive Regression Splines, is a multivariate non-parametric regression procedure introduced in 1991 by world-renowned Stanford statistician and physicist, Jerome Friedman (Friedman, 1991). Salford Systems’ MARS, based on the original code, has been substantially enhanced with new features and capabilities in exclusive collaboration with Friedman.

How to create a slicer in Excel

A good tutorial on how to use one of Excel's new features.

For dashboards and quick filtering, you can’t beat Excel slicers. They’re easy to implement and even easier to use. Here are the basics–plus a few power tips.

SAP video analytics

Lots of SAP videos on analytics.
Digital Enterprise Platform
SAP Digital Business Services
SAPIndustry
SAPLineOfBusiness

SME Solutions and Partner Innovation

MySQL Documentation

Lots of documentation on all MySQL products.

Deeplearning4j Documentation

The site of a Java package for deep learning, with lots of information on neural networks and related topics.

The Many Faces of ROC Analysis

A good tutorial on ROC curves.

Receiver Operating Characteristics (ROC) Analysis originated from signal detection theory, as a model of how well a receiver is able to detect a signal in the presence of noise. Its key feature is the distinction between hit rate (or true positive rate) and false alarm rate (or false positive rate) as two separate performance measures. ROC analysis has also widely been used in medical data analysis to study the effect of varying the threshold on the numerical outcome of a diagnostic test. It has been introduced to machine learning relatively recently, in response to classification tasks with varying class distributions or misclassification costs (hereafter referred to as skew). ROC analysis is set to cause a paradigm shift in machine learning. Separating performance on classes is almost always a good idea from an analytical perspective. For instance, it can help us to

  • understand the behaviour and skew-sensitivity of many machine learning metrics, including rule learning heuristics and decision tree splitting criteria, by plotting their isometrics in ROC space;
  • develop new metrics specifically designed to improve the Area Under the ROC Curve (AUC) of a model;
  • understand fundamental algorithms such as the separate-and-conquer or sequential covering rule learning algorithm, by tracing its trajectory through a sequence of ROC spaces.

The goal of this tutorial is to develop the ROC perspective in a systematic way, demonstrating the many faces of ROC analysis in machine learning.
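To make the hit rate versus false alarm rate distinction concrete, here is a minimal sketch that traces a ROC curve by sweeping the decision threshold over a classifier's scores and then computes the AUC with the trapezoidal rule. The function names (`roc_curve`, `auc`) and the toy labels and scores are assumptions made for illustration; they do not come from the tutorial.

```python
def roc_curve(labels, scores):
    """Return ROC points (false positive rate, true positive rate).

    labels: 1 for positive, 0 for negative. scores: higher means "more positive".
    The threshold is lowered one distinct score at a time; tied scores are
    processed together so they yield a single point.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1  # a hit
            else:
                fp += 1  # a false alarm
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (hypothetical): three positives and three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_curve(labels, scores)
print(points)
print("AUC:", auc(points))
```

For this toy data, 8 of the 9 positive-negative pairs are ranked correctly, so the AUC works out to 8/9. Because both rates are normalised within their own class, the curve does not change when the class proportions change, which is why ROC analysis suits problems with skewed class distributions.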
