How and Why: Decorrelate Time Series

Posted by Armando Brito Mendes | Filed under statistics, Operations Research, materials for professionals

The problem of autocorrelation in time series.

When dealing with time series, the first step consists of isolating trends and periodicities. Once this is done, we are left with a normalized time series, and studying the auto-correlation structure is the next step, called model fitting. The purpose is to check whether the underlying data follows some well-known stochastic process with a similar auto-correlation structure, such as an ARMA process, using tools such as the Box-Jenkins method. Once a fit with a specific model is found, model parameters can be estimated and used to make predictions.

A deeper investigation consists of isolating the auto-correlations to see whether the remaining values, once decorrelated, behave like white noise. If a departure from white noise is found (using a few tests of randomness), the time series in question exhibits unusual patterns not explained by trends, seasonality or auto-correlations. This can be useful knowledge in some contexts such as high-frequency trading, random number generation, cryptography or cyber-security. The analysis of decorrelated residuals can also help identify change points and instances of slope changes in time series, or reveal otherwise undetected outliers.
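
The decorrelation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of the linked article: it assumes the detrended series is well described by a first-order autoregressive model, AR(1), and the function names `autocorr` and `decorrelate_ar1` and the simulated data are hypothetical. It fits the AR(1) coefficient by least squares, subtracts the fitted dependence, and compares the residual autocorrelations with the approximate 95% band expected for white noise.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of x at a positive lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

def decorrelate_ar1(x):
    """Fit x[t] = phi * x[t-1] + e[t] by least squares (assumed AR(1) model).

    Returns the estimated phi and the residuals e[t].
    """
    x = np.asarray(x, dtype=float) - np.mean(x)
    phi = float(np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1]))
    return phi, x[1:] - phi * x[:-1]

# Simulated AR(1) series with true phi = 0.7 (illustrative data only).
rng = np.random.default_rng(0)
n = 2000
noise = rng.standard_normal(n)
series = np.zeros(n)
for t in range(1, n):
    series[t] = 0.7 * series[t - 1] + noise[t]

phi_hat, resid = decorrelate_ar1(series)

# Under white noise, sample autocorrelations fall within about
# +/- 1.96 / sqrt(n) roughly 95% of the time.
band = 1.96 / np.sqrt(len(resid))
resid_acf = [autocorr(resid, k) for k in range(1, 6)]

print(f"estimated phi: {phi_hat:.3f}")
print(f"lag-1 autocorrelation before decorrelation: {autocorr(series, 1):.3f}")
print("residual autocorrelations, lags 1-5:", np.round(resid_acf, 3))
print("lags outside the white-noise band:",
      [k + 1 for k, r in enumerate(resid_acf) if abs(r) > band])
```

Residual autocorrelations that repeatedly fall outside the band would suggest structure that the assumed model does not capture; a higher-order ARMA fit or a formal test of randomness would be the natural next step.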

Tags: forecasting

Read more | Comments off | January 31st, 2018

The 7 Most Important Data Mining Techniques

Posted by Armando Brito Mendes | Filed under materials for professionals

A short introduction to some of the most widely used data mining methods.

Data mining is the process of looking at large banks of information to generate new information. Intuitively, you might think that data “mining” refers to the extraction of new data, but this isn’t the case; instead, data mining is about extrapolating patterns and new knowledge from the data you’ve already collected.

Relying on techniques and technologies from the intersection of database management, statistics, and machine learning, specialists in data mining have dedicated their careers to better understanding how to process and draw conclusions from vast amounts of information. But what are the techniques they use to make this happen?

Tags: data mining

Read more | Comments off | January 3rd, 2018

Playground to Politics

Posted by Armando Brito Mendes | Filed under data sets, statistics, materials for professionals

Data from a questionnaire completed by 142 fifth-form pupils at a North London school.

A study of values and attitudes among fifth formers in a North London comprehensive school.

This survey of teenage attitudes and opinions in a North London comprehensive school (11-18 mixed) was designed and conducted, under my guidance and supervision, by three of my sophomore students as part of their group research dissertation for BA Applied Social Studies (Social Research) at the Polytechnic of North London (PNL, now part of London Metropolitan University). It aimed to discover something about pupils’ future expectations and their awareness of, and attitudes towards, various current social issues and problems, particularly racism and sexism. It replicates various items and scales from other work (Wilson-Patterson, Eysenck, Himmelweit, Srole-Christie), particularly the St Paul’s Girls senior pupils study (Feb 1973), some of which were also used in the SSRC Survey Unit Quality of Life surveys 1971-75.

The self-completion questionnaire was completed in December 1981 by all fifth form pupils present on the day of the survey (N=142). It was administered during time-tabled Social Studies classes and, time permitting, was followed by discussion with class teachers and the PNL students of the issues covered in the survey.

Given the particularly high quality of this project, a user manual was prepared by John Hall and Alison Walker for use with the postgraduate Survey Analysis Workshop and the undergraduate course Data Management and Analysis. It serves as model documentation for similar small survey projects.

Tags: surveys

Read more | Comments off | July 20th, 2017

KNIME course

Posted by Armando Brito Mendes | Filed under GIS maps, materials for professionals, software, videos, visualization

A very good KNIME course; it is introductory but covers a large number of features.

KNIME Online Self-Training

Welcome to the KNIME Self-training course. The focus of this document is to get you started with KNIME as quickly as possible and guide you through essential steps of advanced analytics with KNIME. Optional and very useful topics such as reporting, KNIME Server and database handling are also included to give you an idea of what else is possible with KNIME.

Tags: data analysis, big data, data mining, Knime, text mining

Read more | Comments off | December 16th, 2016

MARS – Multivariate Adaptive Regression Splines

Posted by Armando Brito Mendes | Filed under teaching materials, materials for professionals

A good description of these data analysis algorithms by the authors themselves.

An Overview of MARS

What is “MARS”?

MARS®, an acronym for Multivariate Adaptive Regression Splines, is a multivariate non-parametric regression procedure introduced in 1991 by world-renowned Stanford statistician and physicist, Jerome Friedman (Friedman, 1991). Salford Systems’ MARS, based on the original code, has been substantially enhanced with new features and capabilities in exclusive collaboration with Friedman.

Tags: data analysis, data mining, machine learning

Read more | Comments off | September 23rd, 2016

How to create a slicer in Excel

Posted by Armando Brito Mendes | Filed under lessons, teaching materials, materials for professionals, software

A good tutorial on how to use one of Excel's new features.

For dashboards and quick filtering, you can’t beat Excel slicers. They’re easy to implement and even easier to use. Here are the basics–plus a few power tips.

Tags: Excel

Read more | Comments off | July 20th, 2016

SAP video analytics

Posted by Armando Brito Mendes | Filed under materials for professionals, videos

Lots of videos on SAP analytics.

Digital Enterprise Platform

Analytics (28)
Cloud Platform (4)
Internet of Things (7)
SAP HANA Platform (12)
User Experience (2)

SAP Digital Business Services

SAPIndustry

SAPLineOfBusiness

SME Solutions and Partner Innovation

Tags: data analysis, data mining

Read more | Comments off | July 20th, 2016

MySQL Documentation

Posted by Armando Brito Mendes | Filed under programming languages, materials for professionals, software

Lots of documentation covering all MySQL products.

MySQL Server

MySQL 5.7 Reference Manual (GA)

MySQL 5.6 Reference Manual (GA)

MySQL 5.6 Reference Manual (GA) (Japanese)

MySQL 5.5 Reference Manual (GA)

MySQL Enterprise

MySQL Enterprise Monitor 3.2

MySQL Enterprise Monitor 3.1

MySQL Enterprise Monitor 3.0

MySQL Enterprise Monitor 3.0 (Japanese)

Oracle Enterprise Manager for MySQL Database

MySQL Enterprise Backup 4.0

MySQL Enterprise Backup 3.12

MySQL Enterprise Backup 3.11

MySQL Enterprise Backup 3.11 (Japanese)

MySQL Enterprise Security

MySQL Enterprise Encryption

MySQL Enterprise Audit

MySQL Enterprise Firewall

MySQL Thread Pool

MySQL Cluster

MySQL Cluster NDB 7.3/7.4 Reference Guide (GA)

MySQL Cluster NDB 7.2 Reference Guide (GA)

MySQL Cluster NDB 6.3/7.0/7.1 Reference Guide (GA)

MySQL Cluster API Developer Guide

memcache and MySQL Cluster

MySQL Cluster Manager 1.4

MySQL Cluster Manager 1.3

MySQL Workbench

Connectors & APIs

Connectors and APIs

Connector/J 5.1 (GA)

Connector/J 6.0 (Milestone)

MySQL for Visual Studio

MySQL Utilities / Fabric

MySQL Router

X DevAPI

X DevAPI User Guide

MySQL Connector/J X DevAPI Reference

MySQL Connector/Net X DevAPI Reference

MySQL Connector/Node.js X DevAPI Reference

MySQL Connector/Python X DevAPI Reference

MySQL Shell X DevAPI Reference

Release Notes

MySQL 5.7 Release Notes

MySQL 5.6 Release Notes

MySQL 5.5 Release Notes

MySQL Cluster 7.4 Release Notes

MySQL Cluster 7.3 Release Notes

MySQL Cluster 7.2 Release Notes

MySQL Cluster 7.1 Release Notes

MySQL Cluster Manager 1.4 Release Notes

MySQL Cluster Manager 1.3 Release Notes

MySQL Enterprise Monitor 3.2 Release Notes

MySQL Enterprise Monitor 3.1 Release Notes

MySQL Enterprise Monitor 3.0 Release Notes

Oracle Enterprise Manager for MySQL Database Release Notes

MySQL Enterprise Backup 4.0 Release Notes

MySQL Enterprise Backup 3.12 Release Notes

MySQL Enterprise Backup 3.11 Release Notes

MySQL Shell Release Notes

Connector/J 5.1 Release Notes

Connector/J 6.0 Release Notes

Connector/ODBC Release Notes

Connector/Net Release Notes

Connector/Node.js Release Notes

Connector/Python Release Notes

Connector/C Release Notes

Connector/C++ Release Notes

MySQL Installer Release Notes

MySQL Notifier Release Notes

MySQL for Excel Release Notes

MySQL for Visual Studio Release Notes

MySQL Workbench Release Notes

MySQL Router Release Notes

MySQL Fabric Release Notes

MySQL Utilities Release Notes

Expert Guides

MySQL Internals

MySQL Test Framework 2.0

Tags: SQL

Read more | Comments off | July 20th, 2016

Deeplearning4j Documentation

Posted by Armando Brito Mendes | Filed under materials for professionals, software

The website of a Java package for deep learning, with lots of information on neural networks and related topics.

Tags: data analysis, big data, data mining, software development, machine learning

Read more | Comments off | July 14th, 2016

The Many Faces of ROC Analysis

Posted by Armando Brito Mendes | Filed under materials for professionals

A good tutorial on ROC curves.

Receiver Operating Characteristics (ROC) Analysis originated from signal detection theory, as a model of how well a receiver is able to detect a signal in the presence of noise. Its key feature is the distinction between hit rate (or true positive rate) and false alarm rate (or false positive rate) as two separate performance measures. ROC analysis has also widely been used in medical data analysis to study the effect of varying the threshold on the numerical outcome of a diagnostic test. It has been introduced to machine learning relatively recently, in response to classification tasks with varying class distributions or misclassification costs (hereafter referred to as skew). ROC analysis is set to cause a paradigm shift in machine learning. Separating performance on classes is almost always a good idea from an analytical perspective. For instance, it can help us to

understand the behaviour and skew-sensitivity of many machine learning metrics, including rule learning heuristics and decision tree splitting criteria, by plotting their isometrics in ROC space;
develop new metrics specifically designed to improve the Area Under the ROC Curve (AUC) of a model;
understand fundamental algorithms such as the separate-and-conquer or sequential covering rule learning algorithm, by tracing its trajectory through a sequence of ROC spaces.

The goal of this tutorial is to develop the ROC perspective in a systematic way, demonstrating the many faces of ROC analysis in machine learning.
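
To make the two separate performance measures concrete, here is a minimal Python sketch of how an ROC curve and its AUC can be computed from classifier scores by sweeping the decision threshold. It is not taken from the tutorial: the function names `roc_points` and `auc` and the toy labels and scores are assumptions made for illustration.

```python
import numpy as np

def roc_points(labels, scores):
    """Return (fpr, tpr) arrays from sweeping a threshold over the scores.

    labels: 1 for positive, 0 for negative; higher scores mean "more positive".
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # sort by descending score
    labels, scores = labels[order], scores[order]
    tp = np.cumsum(labels)    # true positives when the top-k items are accepted
    fp = np.cumsum(~labels)   # false positives when the top-k items are accepted
    # Keep one point per distinct score so tied scores share a threshold.
    last_of_group = np.append(np.diff(scores) != 0, True)
    tpr = np.concatenate(([0.0], tp[last_of_group] / labels.sum()))
    fpr = np.concatenate(([0.0], fp[last_of_group] / (~labels).sum()))
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy example (illustrative data only): three positives, three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr, tpr = roc_points(labels, scores)
for f, t in zip(fpr, tpr):
    print(f"false positive rate = {f:.2f}, true positive rate = {t:.2f}")
print(f"AUC = {auc(fpr, tpr):.3f}")
```

Because the true positive rate is normalized by the number of positives and the false positive rate by the number of negatives, the curve does not change when the class distribution changes, which is why it is a convenient tool for studying skew.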

Tags: data mining, DW \ BI, machine learning

Read more | Comments off | July 12th, 2016
