News, articles, and everything else about big data

Lots of interesting material about big data

Machine Learning

A good video course on machine learning

About the Course

Machine learning algorithms can figure out how to perform important tasks by generalizing from examples. This is often feasible and cost-effective when manual programming is not. Machine learning (also known as data mining, pattern recognition and predictive analytics) is used widely in business, industry, science and government, and there is a great shortage of experts in it. If you pick up a machine learning textbook you may find it forbiddingly mathematical, but in this class you will learn that the key ideas and algorithms are in fact quite intuitive. And powerful!
Most of the class will be devoted to supervised learning (in other words, learning in which a teacher provides the learner with the correct answers at training time). This is the most mature and widely used type of machine learning. We will cover the main supervised learning techniques, including decision trees, rules, instances, Bayesian techniques, neural networks, model ensembles, and support vector machines. We will also touch on learning theory with an emphasis on its practical uses. Finally, we will cover the two main classes of unsupervised learning methods: clustering and dimensionality reduction. Throughout the class there will be an emphasis not just on individual algorithms but on ideas that cut across them and tips for making them work.
In the class projects you will build your own implementations of machine learning algorithms and apply them to problems like spam filtering, clickstream mining, recommender systems, and computational biology. This will get you as close to becoming a machine learning expert as you can in ten weeks!
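To make the supervised setting described above concrete, here is a minimal sketch of instance-based learning (one of the techniques the course lists) as a 1-nearest-neighbor classifier. It is not taken from the course material: the function name, the two features, and the toy spam/ham data are hypothetical, chosen only for illustration.

```python
import math


def nearest_neighbor_predict(training_examples, query):
    """Return the label of the training example closest to `query`.

    `training_examples` is a list of (feature_vector, label) pairs. The
    labels play the role of the "correct answers" a teacher provides at
    training time.
    """
    def distance(a, b):
        # Euclidean distance between two equal-length feature vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    _, closest_label = min(
        training_examples, key=lambda example: distance(example[0], query)
    )
    return closest_label


# Hypothetical toy data: (exclamation marks, links) counted per message.
TRAINING = [
    ((0, 0), "ham"),
    ((1, 0), "ham"),
    ((5, 3), "spam"),
    ((7, 4), "spam"),
]

if __name__ == "__main__":
    print(nearest_neighbor_predict(TRAINING, (6, 2)))
    print(nearest_neighbor_predict(TRAINING, (0, 1)))
```

The classifier does no work at training time; it simply stores the labeled examples and generalizes by comparing a new message with them, which is the idea behind instance-based learning.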

Course Syllabus

Week One: Basic concepts in machine learning.
Week Two: Decision tree induction.
Week Three: Learning sets of rules and logic programs.
Week Four: Instance-based learning.
Week Five: Statistical learning.
Week Six: Neural networks.
Week Seven: Model ensembles.
Week Eight: Learning theory.
Week Nine: Support vector machines.
Week Ten: Clustering and dimensionality reduction.

Machine Learning MOOC

A very thorough machine learning course

About the Course

Machine learning is the science of getting computers to act without being explicitly programmed. In the past decade, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome. Machine learning is so pervasive today that you probably use it dozens of times a day without knowing it. Many researchers also think it is the best way to make progress towards human-level AI. In this class, you will learn about the most effective machine learning techniques, and gain practice implementing them and getting them to work for yourself. More importantly, you’ll not only learn about the theoretical underpinnings of learning, but also gain the practical know-how needed to quickly and powerfully apply these techniques to new problems. Finally, you’ll learn about some of Silicon Valley’s best practices in innovation as it pertains to machine learning and AI.

This course provides a broad introduction to machine learning, data mining, and statistical pattern recognition. Topics include: (i) Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks). (ii) Unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning). (iii) Best practices in machine learning (bias/variance theory; innovation process in machine learning and AI). The course will also draw from numerous case studies and applications, so that you’ll also learn how to apply learning algorithms to building smart robots (perception, control), text understanding (web search, anti-spam), computer vision, medical informatics, audio, database mining, and other areas.
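As an illustration of the clustering topic listed under unsupervised learning, the following is a minimal k-means sketch. It is an assumption-laden example rather than the course's own code: the function names and the toy points are hypothetical, and the initial centroids are passed in explicitly so the run is deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def assign(points, centroids):
    """Group each point with the index of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for point in points:
        nearest = min(
            range(len(centroids)),
            key=lambda i: squared_distance(point, centroids[i]),
        )
        clusters[nearest].append(point)
    return clusters


def kmeans(points, initial_centroids, iterations=10):
    """Alternate assignment and centroid update for a fixed number of rounds."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # An empty cluster keeps its previous centroid.
        centroids = [
            mean_point(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, assign(points, centroids)


# Hypothetical toy data: two visually separated groups of 2-D points.
POINTS = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]

if __name__ == "__main__":
    centroids, clusters = kmeans(POINTS, [(0, 0), (10, 10)])
    print(centroids)
    print(clusters)
```

Unlike the supervised case, no labels are supplied: the algorithm only sees the points and groups them by proximity.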

FAQ

  • What is the format of the class? The class will consist of lecture videos, which are broken into small chunks, usually between eight and twelve minutes each. Some of these may contain integrated quiz questions. There will also be standalone quizzes that are not part of video lectures, and programming assignments.
  • How much programming background is needed for the course? The course includes programming assignments and some programming background will be helpful.
  • Do I need to buy a textbook for the course? No, it is self-contained.
  • Will I get a statement of accomplishment after completing this class? Yes. Students who successfully complete the class will receive a statement of accomplishment signed by the instructor.

WEKA: Remote Experiment

Enables distributed computing using a server running WEKA algorithms

Remote experiments enable you to distribute the computing load across multiple computers. In the following we will discuss the setup and operation for HSQLDB and MySQL.

Kaggle competitions

A site for data scientists with challenges posed by companies

Welcome to Kaggle, the leading platform for predictive modeling competitions. Here’s how to jump into competing on Kaggle:

  • New to data science? Visit the Kaggle Wiki
  • Learn about hosting a competition
  • In-Class & Research competitions

Data Warehousing Review

A good site with lots of useful advice on DW and BI

Data Warehouses are increasingly used by enterprises to increase efficiency and competitiveness. Using Scorecarding, Data Mining and OLAP analysis, business value can be extracted from Data Warehouses.

Data Cleansing for Data Warehousing: How important is Extract, Transform, Load (ETL) to Data Warehousing?

Introduction to OLAP: Slice, Dice and Drill!

Selecting an OLAP Application: Minimizing risks in the product selection process

Planning for a Data Warehouse: Starting a Data Warehousing Project? Three words – Plan, Plan and Plan!

Designing OLAP Solutions: MOLAP, ROLAP, HOLAP and other acronyms!

Introduction to Metadata: Case study of an implementation in the insurance industry
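The OLAP operations named in the list above (slice, dice and drill) can be sketched on a tiny in-memory fact table. This is an illustrative sketch, not code from the linked site: the table, its dimensions (`year`, `quarter`, `region`, `product`), the `sales` measure, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact table: one row per (year, quarter, region, product).
FACTS = [
    {"year": 2012, "quarter": "Q1", "region": "North", "product": "A", "sales": 100},
    {"year": 2012, "quarter": "Q1", "region": "South", "product": "A", "sales": 80},
    {"year": 2012, "quarter": "Q2", "region": "North", "product": "B", "sales": 120},
    {"year": 2012, "quarter": "Q2", "region": "South", "product": "B", "sales": 60},
    {"year": 2013, "quarter": "Q1", "region": "North", "product": "A", "sales": 150},
    {"year": 2013, "quarter": "Q1", "region": "South", "product": "B", "sales": 90},
]


def aggregate(facts, dimensions, measure="sales"):
    """Sum `measure` grouped by the given dimensions."""
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[d] for d in dimensions)] += row[measure]
    return dict(totals)


def slice_cube(facts, dimension, value):
    """Slice: fix a single dimension to one value."""
    return [row for row in facts if row[dimension] == value]


def dice_cube(facts, allowed):
    """Dice: restrict several dimensions to sets of allowed values."""
    return [
        row for row in facts
        if all(row[dim] in values for dim, values in allowed.items())
    ]


if __name__ == "__main__":
    # Drill-down: move from yearly totals to the finer year/quarter level.
    print(aggregate(FACTS, ["year"]))
    print(aggregate(FACTS, ["year", "quarter"]))
    # Slice on region, then summarize by year.
    print(aggregate(slice_cube(FACTS, "region", "North"), ["year"]))
    # Dice on year and region, then summarize by product.
    diced = dice_cube(FACTS, {"year": {2012}, "region": {"North"}})
    print(aggregate(diced, ["product"]))
```

Real OLAP servers precompute or index such aggregates (the MOLAP, ROLAP and HOLAP variants differ in where they are stored), but the operations themselves reduce to the filtering and grouping shown here.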
