Warm and cold weather anomalies

Posted by Armando Brito Mendes | Filed under visualização

Another example of good visualizations, this time with climate data

This year’s polar vortex churned up some global warming skeptics, but as we know, it’s more useful to look at trends over significant spans of time than isolated events. And, when you do look at a trend, it’s useful to have a proper baseline to compare against.

To this end, Enigma.io compared warm weather anomalies against cold weather anomalies, from 1964 to 2013. That is, they counted the number of days per year that were warmer than expected and the days it was colder than expected.

An animated map leads the post, but the meat is in the time series. There’s a clear trend towards more warm anomalies.

Since 1964, the proportion of warm and strong warm anomalies has risen from about 42% of the total to almost 67% of the total – an average increase of about 0.5 percentage points per year. This trend, fitted with a generalized linear model, accounts for 40% of the year-to-year variation in warm versus cold anomalies, and is highly significant with a p-value approaching 0.0. Though we remain cautious about making predictions based on this model, it suggests that this yearly proportion of warm anomalies will regularly fall above 70% in the 2030s.
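As a rough check on the arithmetic in those figures, the sketch below draws a straight line through the two quoted endpoints (about 42% in 1964 and about 67% in 2013). That two-point line is an assumption made purely for illustration: Enigma fitted a generalized linear model to all the yearly values, so its numbers will differ from this simple extrapolation.

```python
# Straight-line check on the quoted warm-anomaly shares.
# Assumption: only the two endpoints from the post are used; this is
# not Enigma's generalized linear model.
START_YEAR, START_SHARE = 1964, 42.0   # percent of anomalies that were warm
END_YEAR, END_SHARE = 2013, 67.0


def slope_per_year() -> float:
    """Average change in the warm share, in percentage points per year."""
    return (END_SHARE - START_SHARE) / (END_YEAR - START_YEAR)


def projected_share(year: int) -> float:
    """Extend the two-point line to another year (illustrative only)."""
    return END_SHARE + slope_per_year() * (year - END_YEAR)


if __name__ == "__main__":
    print(f"slope: {slope_per_year():.2f} points per year")
    for year in (2020, 2030, 2040):
        print(year, f"{projected_share(year):.1f}%")
```

The slope comes out close to the 0.5 points per year quoted in the post. A straight line also ignores the year-to-year scatter, which is why the original wording talks about the share "regularly" exceeding 70% rather than crossing it once.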

Explore in full or download the data and analyze it yourself. Nice work. [Thanks, Dan]

Tags: belo, mapas

Read more | Comments off | April 21st, 2014

High-detail maps with Disser

Posted by Armando Brito Mendes | Filed under mapas SIG's, software, visualização

Open source software for working with maps

Open data consultancy Conveyal released Disser, a command-line tool that disaggregates geographic data to show more detail. For example, we’ve seen populations represented with uniformly distributed dots, which is fine for a zoomed-out view. However, when you get in close, it can be useful to see distributions represented more accurately.

If the goal of disaggregation is to make a reasonable guess at the data in its pre-aggregated form, we’ve done an okay job. There’s an obvious flaw with this map, though. People aren’t evenly distributed over a block — they’re concentrated into residential buildings.

So Disser combines datasets of different granularity, so that you can see spreads and concentrations that are closer to real life.
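To make the idea of combining a coarse dataset with a finer one concrete, here is a minimal sketch that splits a block's population across its buildings in proportion to footprint area. This is not Disser's actual algorithm or interface; the function name, the area-proportional rule, and the sample numbers are all assumptions chosen for illustration.

```python
from __future__ import annotations

# Hypothetical disaggregation rule: share a block-level count among
# buildings by footprint area. Not taken from Disser itself.


def allocate_population(block_population: int,
                        building_areas: list[float]) -> list[int]:
    """Split an integer population across buildings, preserving the total."""
    total_area = sum(building_areas)
    if total_area <= 0:
        raise ValueError("building areas must sum to a positive value")

    # Exact (fractional) share for each building.
    exact = [block_population * area / total_area for area in building_areas]
    counts = [int(share) for share in exact]  # round down first

    # Hand the leftover people to the buildings with the largest remainders
    # so the counts still add up to the block total.
    leftover = block_population - sum(counts)
    by_remainder = sorted(range(len(exact)),
                          key=lambda i: exact[i] - counts[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # One block of 100 people and three made-up building footprints (m^2).
    print(allocate_population(100, [300.0, 100.0, 100.0]))
```

Dots would then be scattered inside each building footprint rather than across the whole block, which is the difference between the uniform map and the more realistic one described above.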

Tags: belo, image mining, mapas

Read more | Comments off | April 14th, 2014

9 “must read” articles on Big Data

Posted by Armando Brito Mendes | Filed under estatística, materiais para profissionais, visualização

Texts on big data

My selection

(*) I disagree with this Harvard Business Review author. Senior data scientists work on high level data from various sources, use automated processes for EDA (exploratory analysis) and spend little to no time in tedious, routine, mundane tasks (less than 5% of my time, in my case). I also use robust techniques that work well on relatively dirty data, and … I create and design the data myself in many cases.

Tags: belo, big data, data mining, Estat Descritiva, grafos

Read more | Comments off | April 8th, 2014

Chart errors in the news

Posted by Armando Brito Mendes | Filed under estatística, visualização

Three examples of chart errors on news channels

Fox News bar chart gets it wrong

Because Fox News. See also this, this, and this. [Thanks, Meron]

Tags: análise de dados, belo, data mining, Estat Descritiva

Read more | Comments off | April 4th, 2014

Exponential water tank

Posted by Armando Brito Mendes | Filed under estatística, materiais ensino, visualização

An excellent way to understand exponential growth

Hibai Unzueta, based on a paper by Albert Bartlett, demonstrates exponential growth with a simple animation. It depicts a man standing in a tank with finite capacity and water rising slowly, but at an exponential rate.

Our brains are wired to predict future behaviour based on past behaviour (see here). But what happens when something grows exponentially? For a long time, the numbers are so small in relation to the scale that we hardly see the changes. But even at moderate growth rates exponential functions reach a point where the numbers grow too fast. Once we confirm that our predictions about the future have failed, very little time to react may be left.

All looks safe at first, because the water rises so slowly, but it seems to rise all of a sudden. Oh, the suspense. What will happen to cartoon pixel man?
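A few lines of arithmetic show why the tank seems to fill all of a sudden. The starting volume (0.1% of capacity) and the growth rate (7% per time step) below are assumed values picked for illustration; they are not taken from the animation or from Bartlett's paper.

```python
import math

# Assumed, illustrative parameters (not from the animation).
INITIAL_FRACTION = 0.001   # tank starts 0.1% full
GROWTH_RATE = 0.07         # volume grows 7% per time step


def steps_to_reach(fraction: float) -> float:
    """Time steps until the tank holds the given fraction of its capacity."""
    return math.log(fraction / INITIAL_FRACTION) / math.log(1 + GROWTH_RATE)


def doubling_time() -> float:
    """Time steps needed for the volume to double."""
    return math.log(2) / math.log(1 + GROWTH_RATE)


if __name__ == "__main__":
    half = steps_to_reach(0.5)
    full = steps_to_reach(1.0)
    print(f"half full after {half:.1f} steps")
    print(f"full after      {full:.1f} steps")
    print(f"doubling time   {doubling_time():.1f} steps")
    print(f"share of total time elapsed when half full: {half / full:.0%}")
```

Whatever the parameters, the gap between half full and full is exactly one doubling time, so most of the elapsed time passes while the tank still looks nearly empty.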

Tags: definição, inferência

Read more | Comments off | April 2nd, 2014

What’s your kind of beer?

Posted by Armando Brito Mendes | Filed under estatística, visualização

A good example of a site full of dashboard-style visualizations

Choose your preferred beer strength to begin exploring similar beers.

Explore Similar Beers by:

Overall
Aroma
Taste
Appearance

About the Data

Popularity and top beer styles are based on the number of users who rated the beer.

Tags: belo

Read more | Comments off | March 17th, 2014

Read Histograms and Use Them in R

Posted by Armando Brito Mendes | Filed under estatística, materiais para profissionais, visualização

A good tutorial on building histograms in R

How to Read Histograms and Use Them in R

By Nathan Yau

The chart type often goes overlooked because people don’t understand it. Maybe this will help.

The histogram is one of my favorite chart types, and for analysis purposes, it’s probably the one I use the most. Devised by Karl Pearson (the father of mathematical statistics) in the late 1800s, it’s simple geometrically, robust, and allows you to see the distribution of a dataset.

If you don’t understand what’s driving the chart though, it can be confusing, which is probably why you don’t see it often in general publications.
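What drives the chart is simply counting how many values fall into each of a set of equal-width bins. The sketch below shows that counting step in plain Python. It is not the R code from the tutorial, and the function name and sample values are made up for illustration.

```python
# Minimal equal-width binning, the counting step behind a histogram.
# Illustrative only; the tutorial itself works in R.


def histogram_counts(values, num_bins):
    """Return (bin_edges, counts) for equal-width bins over the data range."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("all values are identical; cannot form bins")
    width = (high - low) / num_bins

    counts = [0] * num_bins
    for value in values:
        # The maximum value belongs in the last bin, not one past it.
        index = min(int((value - low) / width), num_bins - 1)
        counts[index] += 1

    edges = [low + i * width for i in range(num_bins + 1)]
    return edges, counts


if __name__ == "__main__":
    sample = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # made-up data
    edges, counts = histogram_counts(sample, 4)
    for left, right, count in zip(edges, edges[1:], counts):
        print(f"{left:4.1f} to {right:4.1f} | {'#' * count}")
```

Each bar's height is the count for its bin, so changing the number of bins changes the picture even though the data stay the same.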

Tags: análise de dados, data mining, Estat Descritiva, R-software, software estatístico

Read more | Comments off | March 13th, 2014

Useful Videos on Information Visualization

Posted by Armando Brito Mendes | Filed under estatística, videos, visualização

Good videos on data visualization

Noah Iliinsky – Data Visualizations Done Wrong – A Beautiful Collection of Stories and Tips for Success.

The Four Pillars of Data Visualization

Designing Data Visualizations with Noah Iliinsky

Best Practices for Data Visualization

Designing Data Visualizations

Seeing the Story in the Data and Learning to Effectively Communicate – Inspired by Stephen Few Principles, Visualization Guru

David McCandless: “The beauty of data visualization” – Data Detective Telling Stories From Visualization of Information

This also has a nice quiz about visualization principles.

As I collect more, I will consolidate this list.

Tags: belo, big data, data mining, image mining

Read more | Comments off | February 27th, 2014

selfiecity

Posted by Armando Brito Mendes | Filed under videos, visualização

A study of this type of photo, with very good visualizations

Investigating the style of self-portraits (selfies) in five cities across the world.

Selfiecity investigates selfies using a mix of theoretic, artistic and quantitative methods:

We present our findings about the demographics of people taking selfies, their poses and expressions.
Rich media visualizations (imageplots) assemble thousands of photos to reveal interesting patterns.
The interactive selfiexploratory allows you to navigate the whole set of 3200 photos.
Finally, theoretical essays discuss selfies in the history of photography, the functions of images in social media, and methods and dataset.

Selfiecity, from Lev Manovich, Moritz Stefaner, and a small group of analysts and researchers, is a detailed visual exploration of 3,200 selfies from five major cities around the world. The project is both a broad look at demographics and trends and a chance to look closer at the individual observations.

There are several components to the project, but Imageplots (which you might recognize from a couple years ago) and the exploratory section, aptly named Selfiexploratory, will be of most interest.

The two parts let you filter through cities (Bangkok, Berlin, Moscow, New York, and Sao Paulo), age, gender, pose, mood, and a number of other factors, and this information is presented in a grid layout that self-updates as you browse.

So you can get a rough sense of how facets relate. There seems to be a higher proportion of female selfies and average age seems to skew towards younger as you’d expect. The average age of females in this selfie sample seems to be younger than that of males.

However, before you jump to too many conclusions about how countries vary or differences between the sexes, etc., consider the classification process, which was a combination of manual labor via Mechanical Turk and face recognition software. Age, for example, can be tough to estimate from pictures alone since you have outside factors like makeup, angles, and poses. Do these things account for the two- to three-year average difference between the sexes? Maybe. So consider the data. But that should go without saying.

That said, Selfiecity is a fun one I spent a good amount of time browsing. It’s a weird, tiny peek into 3,200 people’s lives, with a dose of quant and art. And don’t miss the theoretical component in essay format, a reflection of social media, communities, and the self.

Tags: belo, data mining, image mining

Read more | Comments off | February 26th, 2014

Data Intelligence and Analytics Resources

Posted by Armando Brito Mendes | Filed under materiais para profissionais, software, videos, visualização

Excellent texts on data science and big data

3. Big Data

4. Visualization

5. Best and Worst of Data Science

6. New Analytics Start-up Ideas

7. Rants about Healthcare, Education, etc.

8. Career Stuff, Training, Salary Surveys

9. Miscellaneous

10. DSC Webinar Series – with video access

Tags: big data, captura de conhecimento, data mining, R-software

Read more | Comments off | February 26th, 2014
