MARS – Multivariate Adaptive Regression Splines
Posted by Armando Brito Mendes | Filed under teaching materials, materials for professionals
A good description of these data analysis algorithms, written by the authors themselves.
An Overview of MARS
What is “MARS”?
MARS®, an acronym for Multivariate Adaptive Regression Splines, is a multivariate non-parametric regression procedure introduced in 1991 by world-renowned Stanford statistician and physicist, Jerome Friedman (Friedman, 1991). Salford Systems’ MARS, based on the original code, has been substantially enhanced with new features and capabilities in exclusive collaboration with Friedman.
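As a rough illustration of the kind of model MARS builds, the sketch below evaluates a piecewise-linear fit assembled from hinge functions of the form max(0, x - t) and max(0, t - x), the basis functions described in Friedman (1991). The knot, the coefficients, and the names `hinge`, `mars_predict`, and `example_terms` are hypothetical choices made for illustration; the sketch does not reproduce the Salford Systems product or the procedure MARS uses to select knots.

```python
def hinge(x, knot, direction=1):
    """Hinge basis function.

    direction=+1 gives max(0, x - knot); direction=-1 gives max(0, knot - x).
    """
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate an additive model made of weighted hinge functions.

    terms is a list of (coefficient, knot, direction) tuples. The values
    used below are hypothetical, not the result of fitting real data.
    """
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


# Hypothetical model with one knot at x = 3:
# slope 0.5 below the knot, slope 2.0 above it.
example_terms = [(2.0, 3.0, 1), (-0.5, 3.0, -1)]

if __name__ == "__main__":
    for x in (1.0, 3.0, 5.0):
        print(x, mars_predict(x, 1.0, example_terms))
```

A fitted MARS model also includes products of hinge functions to capture interactions between variables; this single-variable sketch leaves those out.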
Tags: data analysis, data mining, machine learning
Tinker With a Neural Network
Posted by Armando Brito Mendes | Filed under software, visualization
An excellent web application for understanding how neural networks work.
Um, What Is a Neural Network?
It’s a technique for building a computer program that learns from data. It is based very loosely on how we think the human brain works. First, a collection of software “neurons” are created and connected together, allowing them to send messages to each other. Next, the network is asked to solve a problem, which it attempts to do over and over, each time strengthening the connections that lead to success and diminishing those that lead to failure. For a more detailed introduction to neural networks, Michael Nielsen’s Neural Networks and Deep Learning is a good place to start. For a more technical overview, try Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
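To make the idea of strengthening and weakening connections concrete, here is a minimal sketch of a single software "neuron" trained with the classic perceptron update rule. This is a stand-in chosen for brevity, not the playground's own training code, and the names `train_perceptron`, `predict`, and `and_samples` are hypothetical.

```python
def predict(weights, bias, inputs):
    """Return +1 or -1 depending on the sign of the weighted sum."""
    activation = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation > 0 else -1


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Train one neuron on (inputs, label) pairs with labels in {-1, +1}.

    The neuron attempts the problem over and over. Each time it gets an
    example wrong, the connection weights are nudged toward the correct
    answer; correct predictions leave the weights unchanged.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in samples:
            if predict(weights, bias, inputs) != label:
                weights = [w + learning_rate * label * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * label
    return weights, bias


# Toy problem (an assumption for this sketch): logical AND,
# with -1 for "false" and +1 for "true".
and_samples = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]

if __name__ == "__main__":
    weights, bias = train_perceptron(and_samples)
    for inputs, label in and_samples:
        print(inputs, label, predict(weights, bias, inputs))
```

A single neuron can only separate classes with a straight line, which is why networks stack neurons into hidden layers for harder problems.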
This Is Cool, Can I Repurpose It?
Please do! We’ve open sourced it on GitHub with the hope that it can make neural networks a little more accessible and easier to learn. You’re free to use it in any way that follows our Apache License. And if you have any suggestions for additions or changes, please let us know.
We’ve also provided some controls below to enable you to tailor the playground to a specific topic or lesson. Just choose which features you’d like to be visible below, then save this link or refresh the page.
What Do All the Colors Mean?
Orange and blue are used throughout the visualization in slightly different ways, but in general orange shows negative values while blue shows positive values.
The data points (represented by small circles) are initially colored orange or blue, which correspond to negative one and positive one, respectively.
In the hidden layers, the lines are colored by the weights of the connections between neurons. Blue shows a positive weight, which means the network is using that output of the neuron as given. An orange line shows that the network is assigning a negative weight.
In the output layer, the dots are colored orange or blue depending on their original values. The background color shows what the network is predicting for a particular area. The intensity of the color shows how confident that prediction is.
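The color convention described above can be sketched as a small mapping from a value to a hue and an intensity. This is only an illustration of the convention, assuming values are clamped to the range [-1, 1]; the function name `describe_color` and the "neutral" label for zero are hypothetical and are not taken from the playground's source.

```python
def describe_color(value):
    """Map a weight or prediction to a (hue, intensity) pair.

    Negative values are orange, positive values are blue, and the
    intensity grows with the magnitude of the value (clamped to 1.0).
    """
    clamped = max(-1.0, min(1.0, float(value)))
    if clamped == 0.0:
        return ("neutral", 0.0)  # assumption: zero has no hue
    hue = "blue" if clamped > 0 else "orange"
    return (hue, abs(clamped))


if __name__ == "__main__":
    for v in (-2, -0.25, 0, 0.8):
        print(v, describe_color(v))
```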
What Library Are You Using?
We wrote a tiny neural network library that meets the demands of this educational visualization. For real-world applications, consider the TensorFlow library.
Credits
This was created by Daniel Smilkov and Shan Carter. This is a continuation of many people’s previous work — most notably Andrej Karpathy’s convnet.js demo and Chris Olah’s articles about neural networks. Many thanks also to D. Sculley for help with the original idea and to Fernanda Viégas and Martin Wattenberg and the rest of the Big Picture and Google Brain teams for feedback and guidance.
Tags: data mining, machine learning, web apps
The Portuguese during the Euro, seen through Multibanco data
Posted by Armando Brito Mendes | Filed under statistics, visualization
A good example of using data to infer behavior, although the part about coincidences in the values could have been left out.
How we won Euro 2016, as seen through Multibanco (with infographic)
Published on: 20/07/2016 – 19:11:26
At the time of the final between Portugal and France, the country came to a standstill… and so did cash withdrawals! Discover this and other curiosities that marked how the Portuguese used the Multibanco network as the magnificent 23 won Euro 2016.
Tags: beautiful, big data, data mining, DW \ BI
How to create a slicer in Excel
Posted by Armando Brito Mendes | Filed under lessons, teaching materials, materials for professionals, software
A good tutorial on how to use one of Excel's new features.
For dashboards and quick filtering, you can’t beat Excel slicers. They’re easy to implement and even easier to use. Here are the basics–plus a few power tips.
Tags: Excel
SAP video analytics
Posted by Armando Brito Mendes | Filed under materials for professionals, videos
SME Solutions and Partner Innovation
Tags: data analysis, data mining
MySQL Documentation
Posted by Armando Brito Mendes | Filed under programming languages, materials for professionals, software
Plenty of documentation on all MySQL products.
Tags: SQL
Hackers Remotely Kill a Jeep on the Highway
Posted by Armando Brito Mendes | Filed under videos
An example of the security problems that still exist in the IoT.
Two hackers have developed a tool that can hijack a Jeep over the internet. WIRED senior writer Andy Greenberg takes the SUV for a spin on the highway while the hackers attack it from miles away.
Tags: big data, data mining
Deeplearning4j Documentation
Posted by Armando Brito Mendes | Filed under materials for professionals, software
The website of a Java package for deep learning, with plenty of information about neural networks and related topics.
- How To
- Quickstart: Running Examples and DL4J in Your Projects
- Comprehensive Setup Guide
- Build Locally From Master
- Contribute to DL4J (Developer Guide)
- Choose a Neural Net
- Use the Maven Build Tool
- Vectorize Data With Canova
- Build a Data Pipeline
- Run Benchmarks
- Configure DL4J in Ivy, Gradle, SBT etc
- Find a DL4J Class or Method
- Save and Load Models
- Interpret Neural Net Output
- Visualize Data with t-SNE
- Swap CPUs for GPUs
- Customize an Image Pipeline
- Perform Regression With Neural Nets
- Troubleshoot Training & Select Network Hyperparameters
- Visualize, Monitor and Debug Network Learning
- Speed Up Spark With Native Binaries
- Build a Recommendation Engine With DL4J
- Use Recurrent Networks in DL4J
- Build Complex Network Architectures with Computation Graph
- Train Networks using Early Stopping
- Download Snapshots With Maven
- Customize a Loss Function
- Introduction to Neural Networks
- Multilayer Neural Nets
- Tutorials
- Datasets
- Scaleout
- Text
- Resources
- DL4J, Torch7, Theano and Caffe
- Glossary of Terms for Deep Learning and Neural Nets
- Deep Learning’s Accuracy
- DataVec: ETL for ML
- ND4J Backends: How They Work
- Model Zoo
- Unsupervised Learning: Use Cases
- Eigenvectors, PCA, Covariance and Entropy
- Thought Vectors, AI and NLP
- Questions to Ask When Applying DL
- AI, Machine Learning and Deep Learning
- DL and Reinforcement Learning
- Javadoc: DL4J Methods and Classes
- Canova Javadoc: Canova Methods and Classes
- ND4J User Guide
- ND4J Javadoc
- Scala, Spark and Deep Learning
- Further Reading on Deep Learning
- Deep Learning in Other Languages
- Use Cases
- Architecture
- Features
- Roadmap
- About
- Open Data
- Latest Release Notes
Tags: data analysis, big data, data mining, software development, machine learning
The Many Faces of ROC Analysis
Posted by Armando Brito Mendes | Filed under materials for professionals
A good tutorial on ROC curves.
Receiver Operating Characteristics (ROC) Analysis originated from signal detection theory, as a model of how well a receiver is able to detect a signal in the presence of noise. Its key feature is the distinction between hit rate (or true positive rate) and false alarm rate (or false positive rate) as two separate performance measures. ROC analysis has also widely been used in medical data analysis to study the effect of varying the threshold on the numerical outcome of a diagnostic test. It has been introduced to machine learning relatively recently, in response to classification tasks with varying class distributions or misclassification costs (hereafter referred to as skew). ROC analysis is set to cause a paradigm shift in machine learning. Separating performance on classes is almost always a good idea from an analytical perspective. For instance, it can help us to
- understand the behaviour and skew-sensitivity of many machine learning metrics, including rule learning heuristics and decision tree splitting criteria, by plotting their isometrics in ROC space;
- develop new metrics specifically designed to improve the Area Under the ROC Curve (AUC) of a model;
- understand fundamental algorithms such as the separate-and-conquer or sequential covering rule learning algorithm, by tracing its trajectory through a sequence of ROC spaces.
The goal of this tutorial is to develop the ROC perspective in a systematic way, demonstrating the many faces of ROC analysis in machine learning.
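As a small worked example of the two rates discussed above, the sketch below sweeps a decision threshold over a classifier's scores, records the (false positive rate, true positive rate) point at each step, and estimates the Area Under the ROC Curve with the trapezoidal rule. The function names and the example labels and scores are hypothetical and are not taken from the tutorial; the code assumes that both classes are present in the data.

```python
def roc_points(labels, scores):
    """Return ROC points as (false_positive_rate, true_positive_rate).

    labels are 1 (positive) or 0 (negative). The threshold is lowered
    step by step, so the curve runs from (0, 0) to (1, 1). Assumes at
    least one positive and one negative example.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples tied at the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical data: three positives and three negatives.
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

if __name__ == "__main__":
    pts = roc_points(example_labels, example_scores)
    print(pts)
    print(auc(pts))
```

Because the points depend only on the ranking of the scores within each class, the curve does not change when the class distribution is rescaled, which is the skew-insensitivity the tutorial builds on.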
Tags: data mining, DW \ BI, machine learning
C Tutorial
Posted by Armando Brito Mendes | Filed under programming languages
A good online C tutorial.
Learn C with our popular C tutorial, which will take you from the very basics of C all the way through sophisticated topics like binary trees and data structures. By the way, if you’re on the fence about learning C or C++, I recommend going through the C++ tutorial instead as it is a more modern language.
Introduction and Basic C Features
Pointers, Arrays and Strings
File IO and command line arguments
Linked lists, binary trees, recursion
Finished with all these tutorials? Do some practice problems or view more tutorials.