SQL Server Data Mining News

March 5th, 2018 by Armando Brito Mendes

A site presenting Microsoft's view of data mining
Welcome to SQLServerDataMining.com
This site has been designed by the SQL Server Data Mining team to provide the SQL Server community with access to and information about our in-database data mining and analytics features.  SQL Server 2000 was the first major database release to put analytics [...]

When Variable Reduction Doesn’t Work

January 31st, 2018 by Armando Brito Mendes

A good example of how the usual procedures do not always work
Summary: Exceptions sometimes make the best rules.  Here’s an example of well accepted variable reduction techniques resulting in an inferior model and a case for dramatically expanding the number of variables we start with.

One of the things that keeps us data scientists on our [...]

How signal processing can be used to identify patterns in complex time series

January 31st, 2018 by Armando Brito Mendes

Using signal processing techniques on time series
The trend and seasonality can be accounted for in a linear model by including sinusoidal components with a given frequency. However, finding the appropriate frequency for each sinusoidal component requires a little more digging. This post shows how to use fast Fourier transforms to find these [...]
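
As a rough illustration of the idea in this excerpt, the sketch below uses a fast Fourier transform to recover the dominant period of a synthetic series. It is not the linked post's code: the function names `fft` and `dominant_period`, the sample size, and the two sinusoids are all assumptions made for the example, and the FFT is a plain radix-2 implementation that needs a power-of-two length.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_period(series):
    """Return the period (in samples) of the strongest sinusoidal component."""
    n = len(series)
    mean = sum(series) / n
    # Remove the mean so the zero-frequency bin does not dominate.
    spectrum = fft([v - mean for v in series])
    # Only bins 1 .. n/2 - 1 carry distinct positive frequencies.
    k_best = max(range(1, n // 2), key=lambda k: abs(spectrum[k]))
    return n / k_best


# Hypothetical data: a strong cycle of 16 samples plus a weaker, faster one.
n = 128
series = [
    math.sin(2 * math.pi * t / 16) + 0.4 * math.sin(2 * math.pi * t / 6.4)
    for t in range(n)
]
period = dominant_period(series)
print(period)
```

The recovered period could then be used as the frequency of a sinusoidal term in a linear model. Real data with a trend would normally be detrended first, since a trend leaks power into the low-frequency bins.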

How To Forecast Time Series Data With Multiple Seasonal Periods

January 31st, 2018 by Armando Brito Mendes

Analysis of complex series with multiple seasonal periods
Time series data is produced in domains such as IT operations, manufacturing, and telecommunications. Examples of time series data include the number of client logins to a website on a daily basis, cell phone traffic collected per minute, and temperature variation in a region by the hour. Forecasting [...]

Exponential Smoothing of Time Series Data in R

January 31st, 2018 by Armando Brito Mendes

Exponential smoothing with the R expsmooth package

This article is not about smoothing ore into gems, though you may find a few gems herein.
Systematic Pattern and Random Noise

In “Components of Time Series Data”, I discussed the components of time series data. In time series analysis, we assume that the data consist of a systematic [...]
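
To make the split between systematic pattern and random noise concrete, here is a minimal sketch of simple exponential smoothing. It is written in Python rather than R and does not use the expsmooth package; the function name, the sample data, and the choice of `alpha` are assumptions for illustration only.

```python
def simple_exponential_smoothing(series, alpha):
    """Smooth a series with s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    # Initialise the smoothed series with the first observation.
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical observations; alpha = 0.5 weights new and old values equally.
data = [10.0, 12.0, 11.0, 15.0, 14.0]
smoothed = simple_exponential_smoothing(data, alpha=0.5)
print(smoothed)
```

A smaller `alpha` damps more of the noise but reacts more slowly to real changes in level.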

How To Use Multivariate Time Series Techniques For Capacity Planning on VMs

January 31st, 2018 by Armando Brito Mendes

Multivariate methods for time series with VMs
Capacity planning is an arduous, ongoing task for many operations teams, especially for those who rely on Virtual Machines (VMs) to power their business. At Pivotal, we have developed a data science model capable of forecasting hundreds of thousands of models to automate this task using a multivariate time [...]

Three classes of metrics: centrality, volatility, and bumpiness

January 31st, 2018 by Armando Brito Mendes

Introduces a new class of statistics for time series: bumpiness
All statistical textbooks focus on centrality (median, average or mean) and volatility (variance). None mention the third fundamental class of metrics: bumpiness.

Here we introduce the concept of bumpiness and show how it can be used. Two different datasets can have same mean and variance, but a [...]

Recurrent neural networks, Time series data and IoT – Part One

January 31st, 2018 by Armando Brito Mendes

Using neural networks to forecast univariate series
RNNs are already used for Time series analysis. Because IoT problems can often be modelled as a Time series, RNNs could apply to IoT data. In this multi-part blog, we first discuss Time series applications and then discuss how RNNs could apply to Time series applications. Finally, [...]

Time Series Analysis using R-Forecast package

January 31st, 2018 by Armando Brito Mendes

Demonstrates some of the features of the R forecast package
In today’s blog post, we look at time series analysis using the R package forecast. The objective of the post is to explain the different methods available in the forecast package that can be applied to time series analysis and forecasting.

Avoiding a common mistake with time series

January 31st, 2018 by Armando Brito Mendes

A case in which the trend masks the rest of the series, creating high correlations
A basic mantra in statistics and data science is that correlation is not causation: just because two things appear to be related to each other doesn’t mean that one causes the other. This is a lesson worth learning.

If you work with [...]
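
The effect described in the summary line for this entry, a trend producing high correlations between otherwise unrelated series, can be sketched as follows. The two series, their slopes, the noise level, and the seed are all made up for the example; differencing is used here as one common way to remove a trend and is not necessarily the approach taken in the linked post.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def diff(x):
    """First differences, which remove a linear trend."""
    return [b - a for a, b in zip(x, x[1:])]


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 200
# Two series with independent noise that both happen to trend upward.
a = [0.5 * t + rng.gauss(0, 5) for t in range(n)]
b = [0.3 * t + rng.gauss(0, 5) for t in range(n)]

corr_raw = pearson(a, b)               # dominated by the shared trend
corr_diff = pearson(diff(a), diff(b))  # trend removed, noise compared
print(corr_raw, corr_diff)
```

The raw correlation is high only because both series grow over time. Once the trend is differenced away, the remaining correlation is close to zero.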