introducing R to a non-programmer in one hour
Posted by Armando Brito Mendes | Filed under statistics, teaching materials, software
Biostatistics PhD candidate Alyssa Frazee was tasked with teaching her sister, an undergraduate in sociology, how to use R. She had only one hour.
Once you load in a dataset, things start to get fun. We learned a whole bunch of stuff from this data frame, like how to do basic tabulations and calculate summary statistics, how to figure out if you have missing data, and how to fit a simple linear model. This part was pretty fun because my sister started leading the session: instead of me saying “I’m going to show you how to do this,” it was her asking “Hey, could we make a scatterplot?” or “Do you think we could put the best-fit line on that plot?” I was really glad this happened — I hope it meant she was engaged and enjoying herself!
This is the nice thing about R. There are so many built-in functions and packages that you can get something useful with a few lines of code, and you don’t really even have to know what a function is to get started (although you should eventually). Then you can go as far down the rabbit hole as you want.
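As a rough illustration of the steps described above (tabulation, summary statistics, a missing-data check, a simple linear model, and a scatterplot with a best-fit line), here is a minimal sketch in R. It is not the code from Frazee's session, whose dataset isn't named here. The built-in `mtcars` data frame and the `mpg`, `wt`, and `cyl` columns are assumptions chosen only because they ship with R.

```r
# Built-in example data (an assumption; the original session used its own data frame).
data(mtcars)

# Basic tabulation: count cars by number of cylinders.
cyl_counts <- table(mtcars$cyl)
print(cyl_counts)

# Summary statistics for fuel economy (miles per gallon).
print(summary(mtcars$mpg))

# Missing-data check: number of NA values in each column.
na_counts <- colSums(is.na(mtcars))
print(na_counts)

# Simple linear model: fuel economy as a function of weight.
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))

# Scatterplot with the best-fit line drawn on top.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(fit)
```

Every call here (`table`, `summary`, `is.na`, `lm`, `plot`, `abline`) comes with a standard R installation, so nothing needs to be installed before running it.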
Tags: data analysis, bioinformatics, descriptive statistics, R-software, statistical software
Posted by Armando Brito Mendes | Filed under statistics, materials for professionals
Alex Reinhart, a PhD statistics student at Carnegie Mellon University, covers some of the common analysis mistakes in Statistics Done Wrong.
Statistics Done Wrong is a guide to the most popular statistical errors and slip-ups committed by scientists every day, in the lab and in peer-reviewed journals. Many of the errors are prevalent in vast swathes of the published literature, casting doubt on the findings of thousands of papers. Statistics Done Wrong assumes no prior knowledge of statistics, so you can read it before your first statistics course or after thirty years of scientific practice.
The text is available for free online, and there’s a physical book version on the way.
Tags: data analysis, data mining, medical decision-making, inference
4 Faces of Big Data
Posted by Armando Brito Mendes | Filed under statistics
The 4 Faces of Big Data Challenges You Just Can’t Ignore
Tags: data analysis, big data, data mining
Data at visualizing.org
Posted by Armando Brito Mendes | Filed under data sets, statistics
Connect with expert sources and join the discussion on Data Channels
Tags: data analysis, beautiful, data mining
Reddit Data Is Beautiful
Posted by Armando Brito Mendes | Filed under statistics, software, visualization
Data is Beautiful
A place for visual representations of data: Graphs, charts, maps, etc.
Rules
- A post must be a data visualization.
- Link to original authors or tag as [OC] if you made it.
- Questions must include a visualization.
- Infographics belong in /r/infographics
Infographic vs. Visualization? Data from Star Trek? Data ARE? How do I make one? Read the FAQ
Related
- Datasets
- Infographics
- MapPorn
- RedactedCharts
- SampleSize
- Statistics
- Tableau
- Visualization
- Wordcloud
- Wikimedia Commons
Tags: data analysis, beautiful, IBM SPSS Statistics, R-software, statistical software
WEKA: Remote Experiment
Posted by Armando Brito Mendes | Filed under software
Remote experiments enable you to distribute the computing load across multiple computers. In the following we will discuss the setup and operation for HSQLDB and MySQL.
Tags: data analysis, data mining, DW \ BI, WEKA
kaggle competitions
Posted by Armando Brito Mendes | Filed under Uncategorized
New to Data Science? Visit our Wiki
Learn about hosting a competition
In-Class & Research competitions
Tags: data analysis, data mining, DW \ BI
UCI Knowledge Discovery in Databases Archive
Posted by Armando Brito Mendes | Filed under data sets, statistics
We currently maintain 235 data sets as a service to the machine learning community. You may view all data sets through our searchable interface. Our old web site is still available, for those who prefer the old format. For a general overview of the Repository, please visit our About page. For information about citing data sets in publications, please read our citation policy. If you wish to donate a data set, please consult our donation policy. For any other questions, feel free to contact the Repository librarians. We have also set up a mirror site for the Repository.
Tags: data analysis, data mining
visual exploration of US gun murders
Posted by Armando Brito Mendes | Filed under statistics, visualization
Information visualization firm Periscopic just published a thoughtful interactive piece on gun murders in the United States in 2010. It starts with the individuals: when they were killed, coupled with the years they potentially lost. Each arc represents a person, with lived years in orange and the difference in potential years in white. A mouseover on each arc shows more details about that person.
You can then select categories and demographics, which provide comparisons between ethnicities, gun type, sex, and others. Roll over the bar in the middle for a density plot representation.
Finally, specific breakouts on the bottom provide notables in the data and what they mean.
There are many routes that you could take with this data. At its core, it’s a multivariate dataset with many observations over an entire year. But Periscopic pays close attention to the context and the sensitivity of the data. They make the data relatable while also providing a view of the big picture—without stripping away what the data means. See it live here.
Tags: data analysis, beautiful, knowledge capture, data mining, descriptive statistics
FlowingData Tutorials
Posted by Armando Brito Mendes | Filed under statistics, visualization
How to Animate Transitions Between Multiple Charts
Getting Started with Charts in R
How to Make an Interactive Choropleth Map ☆
More on Making Heat Maps in R ☆
Mapping with Diffusion-based Cartograms ☆
How to Make an Interactive Network Visualization
A Variety of Area Charts with R ☆
How to Draw in R and Make Custom Plots ☆
How to Visualize and Compare Distributions
How to Make a Sankey Diagram to Show Flow ☆
Interactive Time Series Chart with Filters ☆
Calendar Heatmaps to Visualize Time Series Data ☆
How to Hand Edit R Plots in Inkscape ☆
How to Make a Contour Map ☆
Using Color Scales and Palettes in R ☆
Build Interactive Time Series Charts with Filters ☆
How to map connections with great circles
How to Make Bubble Charts
How to visualize data with cartoonish faces ala Chernoff
How to: make a scatterplot with a smooth fitted line
An Easy Way to Make a Treemap
How to Make a Heatmap – a Quick and Easy Solution
How to Make an Interactive Area Graph with Flare
How to Make a US County Thematic Map Using Free Tools
How to Make a Graph in Adobe Illustrator
How to Make Your Own Twitter Bot – Python Implementation
Grabbing Weather Underground Data with BeautifulSoup
Tags: data analysis, knowledge capture, data mining, software development, descriptive statistics, R-software