introducing R to a non-programmer in one hour

A very quick introduction

Biostatistics PhD candidate Alyssa Frazee was tasked with teaching her sister, an undergraduate in sociology, how to use R. She had only one hour.

Once you load in a dataset, things start to get fun. We learned a whole bunch of stuff from this data frame, like how to do basic tabulations and calculate summary statistics, how to figure out if you have missing data, and how to fit a simple linear model. This part was pretty fun because my sister started leading the session: instead of me saying “I’m going to show you how to do this,” it was her asking “Hey, could we make a scatterplot?” or “Do you think we could put the best-fit line on that plot?” I was really glad this happened — I hope it meant she was engaged and enjoying herself!

This is the nice thing about R. There are so many built-in functions and packages that you can get something useful with a few lines of code, and you don’t really even have to know what a function is to get started (although you should eventually). Then you can go as far down the rabbit hole as you want.

A good text on the mistakes professionals make when using statistics

Alex Reinhart, a PhD statistics student at Carnegie Mellon University, covers some of the common analysis mistakes in Statistics Done Wrong.

Statistics Done Wrong is a guide to the most popular statistical errors and slip-ups committed by scientists every day, in the lab and in peer-reviewed journals. Many of the errors are prevalent in vast swathes of the published literature, casting doubt on the findings of thousands of papers. Statistics Done Wrong assumes no prior knowledge of statistics, so you can read it before your first statistics course or after thirty years of scientific practice.

The text is available for free online, and there’s a physical book version on the way.

4 Faces of Big Data

A good text on the various aspects to consider in Big Data

The 4 Faces of Big Data Challenges You just Can’t Ignore

Date: October 20, 2013 Author: Varoon Rajani
Business Decision makers everywhere yearn for the right information that would help them make informed decisions.

Data on visualizing.org

Blog with data for visualization projects

Connect with expert sources and join the discussion on Data Channels

Reddit Data Is Beautiful

A blog about visualization and R

Data is Beautiful

A place for visual representations of data: Graphs, charts, maps, etc.

Rules

Infographic vs. Visualization? Data from Star Trek? Data ARE? How do I make one? Read the FAQ

WEKA: Remote Experiment

Enables distributed computing using a server running WEKA algorithms

Remote experiments enable you to distribute the computing load across multiple computers. In the following we will discuss the setup and operation for HSQLDB and MySQL.

Kaggle competitions

A site for data scientists with challenges proposed by companies

Welcome to Kaggle, the leading platform for predictive modeling competitions. Here’s how to jump into competing on Kaggle:
New to Data Science? Visit our Wiki
Learn about hosting a competition
In-Class & Research competitions

UCI Knowledge Discovery in Databases Archive

Data archive for data mining / machine learning

We currently maintain 235 data sets as a service to the machine learning community. You may view all data sets through our searchable interface. Our old web site is still available, for those who prefer the old format. For a general overview of the Repository, please visit our About page. For information about citing data sets in publications, please read our citation policy. If you wish to donate a data set, please consult our donation policy. For any other questions, feel free to contact the Repository librarians. We have also set up a mirror site for the Repository.

visual exploration of US gun murders

A very dramatic animated visualization

Information visualization firm Periscopic just published a thoughtful interactive piece on gun murders in the United States, in 2010. It starts with the individuals: when they were killed, coupled with the years they potentially lost. Each arc represents a person, with lived years in orange and the difference in potential years in white. A mouseover on each arc shows more details about that person.

You can then select categories and demographics, which provide comparisons between ethnicities, gun type, sex, and others. Roll over the bar in the middle for a density plot representation.

Finally, specific breakouts on the bottom provide notables in the data and what they mean.

There are many routes that you could take with this data. At its core, it’s a multivariate dataset with many observations over an entire year. But Periscopic pays close attention to the context and the sensitivity of the data. They make the data relatable while also providing a view of the big picture—without stripping away what the data means. See it live here.

FlowingData Tutorials

Excellent tutorials on data visualization.

How to Animate Transitions Between Multiple Charts

Getting Started with Charts in R

How to Make an Interactive Choropleth Map

More on Making Heat Maps in R

Mapping with Diffusion-based Cartograms

How to Make an Interactive Network Visualization

A Variety of Area Charts with R

How to Draw in R and Make Custom Plots

How to Visualize and Compare Distributions

How to Make a Sankey Diagram to Show Flow

Interactive Time Series Chart with Filters

Calendar Heatmaps to Visualize Time Series Data

How to Hand Edit R Plots in Inkscape

How to Make a Contour Map

Using Color Scales and Palettes in R

Build Interactive Time Series Charts with Filters

How to map connections with great circles

How to Make Bubble Charts

How to visualize data with cartoonish faces ala Chernoff

How to: make a scatterplot with a smooth fitted line

An Easy Way to Make a Treemap

How to Make a Heatmap – a Quick and Easy Solution

How to Make an Interactive Area Graph with Flare

How to Make a US County Thematic Map Using Free Tools

How to Make a Graph in Adobe Illustrator

How to Make Your Own Twitter Bot – Python Implementation

Grabbing Weather Underground Data with BeautifulSoup
