How We Combined Different Methods to Create Time Series Prediction
Posted by Armando Brito Mendes | Filed under estatística
A good article on classical decomposition for forecasting.
Today, businesses need to be able to predict demand and trends to keep pace with sudden market changes and economic swings. This is exactly where forecasting tools, powered by Data Science, come into play, enabling organizations to deal successfully with strategic and capacity planning. Smart forecasting techniques can be used to reduce risk and support well-informed decisions. One of our customers, an enterprise from the Middle East, needed to predict their market demand for the upcoming twelve weeks. They required a market forecast to help them set their short-term objectives, such as production strategy, and to assist in capacity planning and price control. So, we came up with the idea of creating a custom time series model capable of tackling the challenge. In this article, we will cover the modelling process as well as the pitfalls we had to overcome along the way.
Tags: previsão
Why predicting a mass shooting is impossible
Posted by Armando Brito Mendes | Filed under estatística
An excellent example of forecasting rare events
This cartoon explains why predicting a mass shooting is impossible
It’s simple math.
Updated by Brian Resnick and Javier Zarracina on June 15, 2016, 10:30 a.m. ET
In the wake of mass shootings, many people wonder how they could have been prevented. Were there warning signs that should have been heeded? Was the person mentally ill? Did he or she hold extremist views?
The sad truth is that the only personal factors that reliably correlate with mass shooters are being young and male. There are a lot of young, angsty men in this country. That makes prediction hard.
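The "simple math" here is a base-rate calculation, and it can be sketched in a few lines. The function name and all three numbers below are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the article. The sketch applies Bayes' rule to ask what fraction of people flagged by a screening test would actually be future offenders.

```python
def positive_predictive_value(base_rate, sensitivity, false_positive_rate):
    """Fraction of flagged people who are true positives (Bayes' rule)."""
    true_positives = base_rate * sensitivity
    false_positives = (1.0 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs (assumptions, not data from the article):
# one future offender per million people, a screen that catches 99% of
# offenders, and a screen that wrongly flags 1% of everyone else.
ppv = positive_predictive_value(
    base_rate=1e-6, sensitivity=0.99, false_positive_rate=0.01
)
print(f"Share of flagged people who are true positives: {ppv:.6f}")
```

Under these assumed numbers, roughly one flagged person in ten thousand would be a true positive, even though the imagined test is far more accurate than "young and male" could ever be.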
Tags: data mining, previsão
A New View of Statistics
Posted by Armando Brito Mendes | Filed under estatística, lições, materiais ensino
A web book with loads of well-explained topics
Mar 2013. Coming very soon: a slideshow and Excel workbook for an introductory course of 10 lectures on statistics. Aug 2011. Check out the following 2010 articles at Sportscience: assigning subjects to treatments in a controlled trial; regression vs limits of agreement in measure-comparison studies; magnitudes of effects derived from linear models. See the frame at right for links to much more, including the progressive statistics and research design articles. Previous updates…
New original approaches to statistics for researchers: the examples are taken from exercise and sport science, but the principles apply to all empirical sciences. Read more in the preface.
Feedback wanted: if you can’t understand something here, it’s my fault. Email me.
Become a license holder…eventually! Not yet. More…
Full Contents
Short Contents:
Preface: About These Pages
Summarizing Data
Simple Statistics & Effect Statistics
Dimension Reduction
Precision of Measurement
Generalizing to a Population
Confidence Limits & Statistical Significance
Statistical Models
Estimating Sample Size
Summary: The Most Important Points
Quiz
Reference: Hopkins, W. G. (2000). A new view of statistics. Internet Society for Sport Science: http://www.sportsci.org/resource/stats/.
Tags: análise de dados, Estat Descritiva, inferência, previsão
A Programmer’s Guide to Data Mining
Posted by Armando Brito Mendes | Filed under estatística, materiais para profissionais
A guide to practical data mining, collective intelligence, and building recommendation systems by Ron Zacharski.
About This Book
Before you is a tool for learning basic data mining techniques. Most data mining textbooks focus on providing a theoretical foundation for data mining, and as a result, may seem notoriously difficult to understand. Don’t get me wrong, the information in those books is extremely important. However, if you are a programmer interested in learning a bit about data mining you might be interested in a beginner’s hands-on guide as a first step. That’s what this book provides.
This guide follows a learn-by-doing approach. Instead of passively reading the book, I encourage you to work through the exercises and experiment with the Python code I provide. I hope you will be actively involved in trying out and programming data mining techniques. The textbook is laid out as a series of small steps that build on each other until, by the time you complete the book, you have laid the foundation for understanding data mining techniques. This book is available for download for free under a Creative Commons license (see link in footer). You are free to share the book, and remix it. Someday I may offer a paper copy, but the online version will always be free.
Table of Contents
This book’s contents are freely available as PDF files. When you click on a chapter title below, you will be taken to a webpage for that chapter. The page contains links for a PDF of that chapter and for any sample Python code and data that chapter requires. Please let me know if you see an error in the book, if some part of the book is confusing, or if you have some other comment. I will use these to revise the chapters.
Chapter 1: Introduction
Finding out what data mining is and what problems it solves. What you will be able to do when you finish this book.
Chapter 2: Get Started with Recommendation Systems
Introduction to social filtering. Basic distance measures including Manhattan distance, Euclidean distance, and Minkowski distance. Pearson Correlation Coefficient. Implementing a basic algorithm in Python.
Chapter 3: Implicit ratings and item-based filtering
A discussion of the types of user ratings we can use. Users can explicitly give ratings (thumbs up, thumbs down, 5 stars, or whatever) or they can rate products implicitly: if they buy an mp3 from Amazon, we can view that purchase as a ‘like’ rating.
Chapter 4: Classification
In previous chapters we used people’s ratings of products to make recommendations. Now we turn to using attributes of the products themselves to make recommendations. This approach is used by Pandora among others.
Chapter 5: Further Explorations in Classification
A discussion on how to evaluate classifiers including 10-fold cross-validation, leave-one-out, and the Kappa statistic. The k Nearest Neighbor algorithm is also introduced.
Chapter 6: Naïve Bayes
An exploration of Naïve Bayes classification methods. Dealing with numerical data using probability density functions.
Chapter 7: Naïve Bayes and unstructured text
This chapter explores how we can use Naïve Bayes to classify unstructured text. Can we classify Twitter posts about a movie as to whether the post was a positive review or a negative one?
Chapter 8: Clustering
Clustering – both hierarchical and kmeans clustering.
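As a taste of the distance measures listed under Chapter 2, here is a minimal Python sketch. It is not the book's own code: the function names, the dictionary-of-ratings layout, and the two sample users are assumptions made for illustration. Minkowski distance with p=1 gives Manhattan distance and with p=2 gives Euclidean distance; both are computed only over items that both users rated.

```python
from math import sqrt


def minkowski(ratings_a, ratings_b, p):
    """Minkowski distance over shared items; p=1 Manhattan, p=2 Euclidean."""
    shared = [item for item in ratings_a if item in ratings_b]
    if not shared:
        return None  # no items in common, so the distance is undefined
    total = sum(abs(ratings_a[i] - ratings_b[i]) ** p for i in shared)
    return total ** (1.0 / p)


def pearson(ratings_a, ratings_b):
    """Pearson correlation coefficient over shared items."""
    shared = [item for item in ratings_a if item in ratings_b]
    n = len(shared)
    if n == 0:
        return 0.0
    xs = [ratings_a[i] for i in shared]
    ys = [ratings_b[i] for i in shared]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a user with constant ratings has no defined correlation
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings, keyed by item name.
alice = {"A": 1, "B": 2, "C": 3}
bob = {"A": 2, "B": 4, "C": 6, "D": 1}

manhattan = minkowski(alice, bob, 1)
euclidean = minkowski(alice, bob, 2)
correlation = pearson(alice, bob)
```

In this made-up example the two users are far apart by distance but perfectly correlated, since one rates every shared item exactly twice as high as the other. That difference is why a correlation measure can suit users who use the rating scale differently.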
Tags: data mining, previsão
Better data centers through machine learning
Posted by Armando Brito Mendes | Filed under materiais para profissionais
It’s no secret that we’re obsessed with saving energy. For over a decade we’ve been designing and building data centers that use half the energy of a typical data center, and we’re always looking for ways to reduce our energy use even further. In our pursuit of extreme efficiency, we’ve hit upon a new tool: machine learning. Today we’re releasing a white paper (PDF) on how we’re using neural networks to optimize data center operations and drive our energy use to new lows.
Tags: análise de dados, data mining, previsão
IFORS Education Resources Project
Posted by Armando Brito Mendes | Filed under estatística, Investigação Operacional, SAD - DSS, software
Welcome to the International Federation of Operational Research Societies (IFORS) Education Resources Project
- Main Page (19:13, 3 December 2013)
- Biased Random-Key Genetic Algorithms: A Tutorial (21:57, 2 December 2013)
- The Discrete Event System Specification Formalism (19:59, 2 December 2013)
- Urban Operations Research (01:36, 2 December 2013)
- Stochastic Models for Design and Planning (01:34, 2 December 2013)
- Queueing Theory Books Online (01:31, 2 December 2013)
- Practical Queueing Theory in Java (01:31, 2 December 2013)
- Explore Queueing Theory for Scheduling, Resource Allocation and Traffic Flow Applications (01:28, 2 December 2013)
- Stochastic Processes Course Notes (01:26, 2 December 2013)
- Test Problems for Non-Linear Programming (01:23, 2 December 2013)
- OR Notes: Separable Programming (01:21, 2 December 2013)
- OR Notes: Non-Linear Programming (01:20, 2 December 2013)
Tags: decisao em grupo, decisão médica, otimização, previsão, problemas, programação em folha de cálculo, software de otimização, software estatístico
Little Book of R for Time Series!
Posted by Armando Brito Mendes | Filed under estatística, software
- How to install R
- Using R for Time Series Analysis
Tags: previsão, R-software
SPSS Programs for Analyzing Lag-Sequential Categorical Data
Posted by Armando Brito Mendes | Filed under estatística, materiais ensino, software
This paper describes simple and flexible programs for conducting lag sequential event analyses using SAS and SPSS. The programs read a stream of codes and produce a variety of lag sequential statistics, including transitional frequencies, expected transitional frequencies, transitional probabilities, z values, adjusted residuals, Yule’s Q values, likelihood ratio tests of stationarity across time and homogeneity across groups or segments, transformed kappas for unidirectional dependence, bidirectional dependence, parallel and nonparallel dominance, and significance levels based on both parametric and randomization tests.
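To make the first two statistics in that list concrete, here is a minimal Python sketch of transitional frequencies and transitional probabilities for a stream of codes. It is an illustration only, not a port of the SAS or SPSS programs the paper describes; the function names and the sample stream are hypothetical.

```python
from collections import Counter


def transitional_frequencies(codes, lag=1):
    """Count how often code `given` is followed by code `target` `lag` steps later."""
    return Counter(zip(codes, codes[lag:]))


def transitional_probabilities(codes, lag=1):
    """Convert transitional frequencies into P(target | given) at the given lag."""
    freqs = transitional_frequencies(codes, lag)
    row_totals = Counter()
    for (given, _target), count in freqs.items():
        row_totals[given] += count
    return {pair: count / row_totals[pair[0]] for pair, count in freqs.items()}


# Hypothetical stream of observed behaviour codes.
stream = list("ABABAC")

freqs = transitional_frequencies(stream)
probs = transitional_probabilities(stream)
```

For this stream, A is followed by B twice and by C once, so the lag-1 probability of B given A is 2/3. The other statistics mentioned above (z values, adjusted residuals, Yule's Q) build on the same frequency table.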
Tags: captura de conhecimento, data mining, desnvolvimento de software, IBM SPSS Statistics, previsão, qualidade, software estatístico
SPSS Web Book Regression with SPSS
Posted by Armando Brito Mendes | Filed under estatística, materiais ensino, software
By Xiao Chen, Phil Ender, Michael Mitchell and Christine Wells (in alphabetical order). The aim of these materials is to help you increase your skills in using regression analysis with SPSS. This web book does not teach regression, per se, but focuses on how to perform regression analyses using SPSS. It is assumed that you have had at least a one quarter/semester course in regression (linear models) or a general statistical methods course that covers simple and multiple regression and have access to a regression textbook that explains the theoretical background of the materials covered in these chapters.
Tags: análise de dados, data mining, IBM SPSS Statistics, inferência, previsão, software estatístico
IBM SPSS product portfolio
Posted by Armando Brito Mendes | Filed under estatística, software
Why SPSS software?
With SPSS predictive analytics software, you can predict with confidence what will happen next so that you can make smarter decisions, solve problems and improve outcomes.
Tags: análise de dados, Estat Descritiva, IBM SPSS Statistics, inferência, previsão, software estatístico