50 Great Examples of Data Visualization

Good examples of graphical representations

Wrapping your brain around data online can be challenging, especially when dealing with huge volumes of information.

And trying to find related content can also be difficult, depending on what data you’re looking for.

But data visualizations can make all of that much easier, letting you see the concepts you’re learning about in a more interesting and often more useful way.

Below are 50 of the best data visualizations and tools for creating your own visualizations out there, covering everything from Digg activity to network connectivity to what’s currently happening on Twitter.

SQL Server Data Mining News

A site presenting Microsoft's view of data mining

Welcome to SQLServerDataMining.com

This site has been designed by the SQL Server Data Mining team to provide the SQL Server community with access to and information about our in-database data mining and analytics features.  SQL Server 2000 was the first major database release to put analytics in the database.  Catch up with the latest SQL Server Data Mining news in our newsletter.

SQL Server 2012 SP1 Data Mining Add-ins for Office (with 32-bit or 64-bit Support)

The Data Mining Add-ins allow you to harness the power of SQL Server 2012 predictive analytics in Excel and Visio and they have been updated to include 32-bit or 64-bit support for Office 2010 or Office 2013. Use Table Analysis Tools to get insight with a couple of clicks. Use the Data Mining tab for full-lifecycle data mining, and build models which can be exported to a production server.  Visualize your models in Visio.

SQL Server 2012 Data Mining

Microsoft expert Rafal Lukawiecki provides free and paid videos on data mining for SQL Server 2012 at Project Botticelli. The website has other Microsoft BI topics too from leading Microsoft experts.

SQL Server DM with Excel 2010 and PowerPivot

Microsoft MVP Mark Tabladillo shows you how to unleash SQL Server 2008 Data Mining with Excel 2010 and SQL Server PowerPivot for Excel, Microsoft’s new self-service BI offering.

When Variable Reduction Doesn’t Work

A good example of how the usual procedures don't always work

Summary: Exceptions sometimes make the best rules.  Here’s an example of well-accepted variable reduction techniques resulting in an inferior model and a case for dramatically expanding the number of variables we start with.

One of the things that keeps us data scientists on our toes is that the well-established rules of thumb don’t always work.  Certainly one of the most well-worn of these rules is the parsimonious model: always seek to create the best model with the fewest variables.  And woe to you who violate this rule.  Your model will overfit, include false random correlations, or at the very least be judged slow and clunky.

Certainly this is a rule I embrace when building models, so I was surprised and then delighted to find a well-conducted study by Lexis/Nexis that lays out a case where this clearly isn’t true.

How signal processing can be used to identify patterns in complex time series

Using signal processing techniques on time series

The trend and seasonality can be accounted for in a linear model by including sinusoidal components with a given frequency. However, finding the appropriate frequency for each sinusoidal component requires a little more digging. This post shows how to use fast Fourier transforms to find these frequencies.
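As a rough illustration of the idea, the sketch below removes a linear trend and then reads the strongest periods off the power spectrum returned by NumPy's real FFT. The function name `dominant_periods`, the synthetic series, and its cycle lengths of 7 and 30 samples are assumptions made for this example; they are not taken from the linked post.

```python
import numpy as np

def dominant_periods(y, n_peaks=2):
    """Return the n_peaks periods (in samples) with the largest spectral power."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Remove a straight-line trend so it does not swamp the low frequencies.
    t = np.arange(n)
    slope, intercept = np.polyfit(t, y, 1)
    detrended = y - (slope * t + intercept)
    # One-sided power spectrum of the real-valued series.
    power = np.abs(np.fft.rfft(detrended)) ** 2
    freqs = np.fft.rfftfreq(n, d=1.0)  # cycles per sample
    # Skip the zero-frequency bin, then take the strongest remaining bins.
    strongest = np.argsort(power[1:])[::-1][:n_peaks] + 1
    return [1.0 / freqs[i] for i in strongest]

# Hypothetical data: linear trend plus cycles of 7 and 30 samples plus noise.
rng = np.random.default_rng(0)
t = np.arange(420)
y = (0.05 * t
     + 3.0 * np.sin(2 * np.pi * t / 7)
     + 1.5 * np.sin(2 * np.pi * t / 30)
     + rng.normal(0, 0.3, t.size))

periods = dominant_periods(y, n_peaks=2)
print(periods)
```

The recovered periods can then be used as the frequencies of the sinusoidal regressors in the linear model. The series length here is a multiple of both cycle lengths, so each cycle falls on an exact FFT bin; with real data the peaks are usually smeared across neighbouring bins.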


How To Forecast Time Series Data With Multiple Seasonal Periods

Analysis of complex series with multiple seasonal periods

Time series data is produced in domains such as IT operations, manufacturing, and telecommunications. Examples of time series data include the number of client logins to a website on a daily basis, cell phone traffic collected per minute, and temperature variation in a region by the hour. Forecasting a time series signal ahead of time helps us make decisions such as planning capacity and estimating demand. Previous time series analysis blog posts focused on processing time series data that resides in a Greenplum database using SQL functions. In this post, I will examine the modeling steps involved in forecasting a time series sequence with multiple seasonal periods. The various steps involved are outlined below:

  • Multiple seasonality is modelled with the help of Fourier series with different periods
  • External regressors in the form of Fourier terms are added to an ARIMA model to account for the seasonal behavior
  • The Akaike Information Criterion (AIC) is used to find the best-fitting model
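The steps above can be sketched in simplified form. The example below builds Fourier terms for two seasonal periods and picks the number of harmonics per period by AIC. To stay self-contained it fits the regressors by ordinary least squares with Gaussian errors instead of an ARIMA model, so it only illustrates the regressor construction and the AIC comparison; a fuller version would pass the same matrix to an ARIMA fit as external regressors. All function names, the periods (24 and 168 samples), and the synthetic data are assumptions for illustration.

```python
import numpy as np

def fourier_terms(t, period, k):
    """Return k sine/cosine pairs (k >= 1) for one seasonal period as columns."""
    cols = []
    for j in range(1, k + 1):
        angle = 2 * np.pi * j * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)

def design_matrix(t, periods, ks):
    """Intercept column plus Fourier terms for every (period, k) pair."""
    blocks = [np.ones((len(t), 1))]
    blocks += [fourier_terms(t, p, k) for p, k in zip(periods, ks)]
    return np.hstack(blocks)

def gaussian_aic(y, X):
    """AIC of a least-squares fit with Gaussian errors, up to an additive constant."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    n, p = X.shape
    return n * np.log(rss / n) + 2 * (p + 1)  # +1 counts the error variance

# Hypothetical hourly series with a daily (24) and a weekly (168) cycle.
rng = np.random.default_rng(1)
t = np.arange(24 * 7 * 4, dtype=float)
y = (10.0
     + 4.0 * np.sin(2 * np.pi * t / 24)
     + 2.0 * np.cos(2 * np.pi * 2 * t / 24)
     + 3.0 * np.sin(2 * np.pi * t / 168)
     + rng.normal(0, 0.5, t.size))

periods = (24, 168)
# Try every combination of harmonics and keep the one with the lowest AIC.
candidates = [(k_day, k_week) for k_day in range(1, 5) for k_week in range(1, 4)]
best_ks = min(candidates, key=lambda ks: gaussian_aic(y, design_matrix(t, periods, ks)))
print("harmonics per period with the lowest AIC:", best_ks)
```

The weekly harmonics are capped at three here because the seventh weekly harmonic coincides with the daily fundamental, which would make the design matrix collinear.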

Exponential Smoothing of Time Series Data in R

Exponential smoothing with the R expsmooth package

This article is not about smoothing ore into gems, though you may find a few gems herein.

Systematic Pattern and Random Noise

In “Components of Time Series Data”, I discussed the components of time series data. In time series analysis, we assume that the data consist of a systematic pattern (usually a set of identifiable components) and random noise (error), which often makes the pattern difficult to identify. Most time series analysis techniques involve some form of filtering out noise to make the pattern more noticeable.
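To make the filtering idea concrete, here is a minimal sketch of simple exponential smoothing, the most basic member of the family named in the title. It is written in Python with NumPy instead of the R package used in the linked article, and the function name, the choice of initial value, and the sample data are assumptions for illustration.

```python
import numpy as np

def simple_exponential_smoothing(y, alpha):
    """Smooth y with s[0] = y[0] and s[t] = alpha * y[t] + (1 - alpha) * s[t-1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    y = np.asarray(y, dtype=float)
    s = np.empty_like(y)
    s[0] = y[0]  # assumed initialisation: start from the first observation
    for i in range(1, len(y)):
        s[i] = alpha * y[i] + (1.0 - alpha) * s[i - 1]
    return s

smoothed = simple_exponential_smoothing([10.0, 12.0, 11.0, 15.0], alpha=0.5)
print(smoothed)  # [10. 11. 11. 13.]
```

A smaller `alpha` gives more weight to the past and filters out more noise, at the cost of reacting more slowly to real changes in level.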

How To Use Multivariate Time Series Techniques For Capacity Planning on VMs

Multivariate time series methods applied to VMs

Capacity planning is an arduous, ongoing task for many operations teams, especially for those who rely on Virtual Machines (VMs) to power their business. At Pivotal, we have developed a data science model capable of forecasting hundreds of thousands of models to automate this task using a multivariate time series approach. Open to reuse for other areas such as industrial equipment or vehicle engines, this technique can be applied broadly to anything where regular monitoring data can be collected.


Three classes of metrics: centrality, volatility, and bumpiness

Introduces a new class of statistics for time series: bumpiness

All statistical textbooks focus on centrality (median, average or mean) and volatility (variance). None mention the third fundamental class of metrics: bumpiness.

Here we introduce the concept of bumpiness and show how it can be used. Two different datasets can have the same mean and variance but different bumpiness. Bumpiness is linked to how the data points are ordered, while centrality and volatility completely ignore order. So, bumpiness is useful for datasets where order matters, in particular time series. Also, bumpiness integrates the notion of dependence (among the data points), while centrality and variance do not. Note that a time series can have high volatility (high variance) and low bumpiness. The converse is also true.
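The point about ordering can be checked with a small sketch. This excerpt does not give the formula for the bumpiness coefficient r, so the example below uses lag-1 autocorrelation as an assumed stand-in for an order-sensitive statistic; the function name and the data are likewise hypothetical.

```python
import numpy as np

def lag1_autocorrelation(x):
    """Order-sensitive statistic (assumed stand-in for the bumpiness coefficient)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.sum(d[:-1] * d[1:]) / np.sum(d * d))

rng = np.random.default_rng(42)
values = rng.normal(size=500)
ordered = np.sort(values)            # same values, arranged in increasing order
shuffled = rng.permutation(values)   # same values, arranged at random

# Mean and variance ignore order, so they agree for both arrangements.
print(ordered.mean(), shuffled.mean())
print(ordered.var(), shuffled.var())

# The order-sensitive statistic separates the two arrangements.
print(lag1_autocorrelation(ordered), lag1_autocorrelation(shuffled))
```

Both arrays contain exactly the same numbers, so any statistic that differs between them must depend on order alone.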

The attached Excel spreadsheet shows computations of the bumpiness coefficient r for various time series. It is also of interest to readers who wish to learn new Excel concepts such as random number generation with Rand, indirect references with Indirect, and Rank, Large and other powerful but little-known Excel functions. It is also an example of a fully interactive Excel spreadsheet driven by two core parameters.

Finally, this article shows (1) how a new concept is conceived, (2) how a robust, modern definition then materializes, and (3) how a more meaningful definition is eventually created, based on and compatible with previous science.

Recurrent neural networks, Time series data and IoT – Part One

Using neural networks to forecast univariate series

RNNs are already used for time series analysis. Because IoT problems can often be modelled as time series, RNNs could apply to IoT data. In this multi-part blog, we first discuss time series applications and then how RNNs could apply to them. Finally, we discuss applicability to IoT.

In this article (Part One), we present the overall thought process behind the use of recurrent neural networks in time series applications, especially a type of RNN called Long Short-Term Memory networks (LSTMs).

Time Series Analysis using R-Forecast package

Demonstrates some of the features of the R forecast package

In today’s blog post, we look at time series analysis using the R package forecast. The objective of the post is to explain the different methods available in the forecast package that can be applied to time series analysis and forecasting.
