What are you going to do with that degree?
Posted by Armando Brito Mendes | Filed under estatística, visualização
Jobs by college major
This is a quick Sankey visualization of how college majors relate to professions, based on data from the American Community Survey. On the left are the largest college majors; to the right are the most common professions.
To see broad fields like “Sciences” and “Humanities”, see the edited version of this page.
The width of each stream shows how many people with that major are in that field. (The color shows whether that’s more or fewer people than expected based on how big the major is: hover over to see just how many more it is.)
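One way to read the width and color encoding is sketched below. This is only an illustration, not the chart's actual code: it assumes that "expected" means the count you would get if majors and jobs were independent (major total × job total ÷ grand total), and the function name `stream_stats` and the toy counts are made up for the example.

```python
from collections import defaultdict

def stream_stats(counts):
    """Summarize each (major, job) stream.

    counts maps (major, job) -> number of people.
    Assumption: "expected" is the count under independence of major and job.
    """
    major_totals = defaultdict(int)
    job_totals = defaultdict(int)
    for (major, job), n in counts.items():
        major_totals[major] += n
        job_totals[job] += n
    grand_total = sum(counts.values())

    stats = {}
    for (major, job), n in counts.items():
        expected = major_totals[major] * job_totals[job] / grand_total
        stats[(major, job)] = {
            "width": n,             # stream width: people with this major in this job
            "expected": expected,   # count expected from the sizes of major and job
            "ratio": n / expected,  # above 1 means more people than expected
        }
    return stats

# Toy counts, invented purely for illustration (not American Community Survey data).
toy_counts = {
    ("History", "Lawyer"): 30,
    ("History", "Teacher"): 20,
    ("Biology", "Lawyer"): 10,
    ("Biology", "Teacher"): 40,
}
toy_stats = stream_stats(toy_counts)
```

With these toy numbers, `toy_stats[("History", "Lawyer")]["ratio"]` comes out above 1, so that stream would be colored as "more than expected".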
You surely see that the lines are too small to understand in most cases: to actually see what’s going on with a particular field or job, click on a box and the chart will filter down to just the people who either majored in the field, or ended up employed in the job. (Click on one of the connecting lines to see both at once.)
I have not developed this that far because I am not sure how useful it ultimately is: my basic goal was a quick way to see, for example, what jobs history majors ended up in. (Largest is lawyers, but also schoolteachers; what you would expect, but worth knowing.)
You might also like my visualization of changing college degrees over time.
Tags: belo, Estat Descritiva
Tutorial: How to detect spurious correlations
Posted by Armando Brito Mendes | Filed under estatística, materiais ensino
Tutorial: How to detect spurious correlations, and how to find the real ones
Designed specifically for big data work in our research lab, the new and simple synthetic metric for strong correlation proposed in this article should be used whenever you want to check whether there is a real association between two variables, especially in large-scale automated data science or machine learning projects. Use this new metric now to avoid being accused of reckless data science and even being sued for wrongful analytic practice.
Tags: data mining, Estat Descritiva, inferência
GE.com's data visualization site
Posted by Armando Brito Mendes | Filed under estatística, visualização
GE Works. Building, Moving, Powering and Curing the world. In the process, our technologies are generating data on a petabyte scale. This data contains valuable information that will drive insights, innovations, and discoveries, but it can be difficult to access and digest. Using data visualization, we’re pairing science and design to simplify the complexity and drive a deeper understanding of the context in which we operate.
We encourage you to explore the projects below.
For further information about GE’s data visualization program, please contact us at datavizinfo@ge.com
To share your own visualizations, please visit www.visualizing.org
Tags: análise de dados, belo, data mining, Estat Descritiva, mapas
Posted by Armando Brito Mendes | Filed under estatística, visualização
Data Visualization – Banking Case Lab: Microsoft Excel – Use Secondary Axis to Create Two Y Axes
25th May, 2014 · Roopam Upadhyay
Analytics Lab
Banking Case
Using Secondary Axis to Create Two Y Axes in Excel
Tags: Estat Descritiva, Excel
Chart humor from kindofnormal
Posted by Armando Brito Mendes | Filed under estatística, materiais ensino, materiais para profissionais, visualização
Some examples:
Tags: belo, Estat Descritiva
khanacademy: Expected value
Posted by Armando Brito Mendes | Filed under estatística, materiais ensino, videos
Original video: Expected Value: E(X) (https://www.khanacademy.org/math/probability/random-variables-topic/random_variables_prob_dist/v/expected-value–e-x). Khan Academy Portugal offers free online mathematics explanations from the 1st through the 12th year of schooling. This video was produced by Khan Academy and translated into Portuguese by Fundação Portugal Telecom (see all available videos at http://fundacao.telecom.pt/khanacademy).
Tags: Estat Descritiva
khanacademy: Random variables
Posted by Armando Brito Mendes | Filed under estatística, materiais ensino, videos
Original video: Introduction to Random Variables (https://www.khanacademy.org/math/probability/independent-dependent-probability/old_prob_videos/v/introduction-to-random-variables). Khan Academy Portugal offers free online mathematics explanations from the 1st through the 12th year of schooling. This video was produced by Khan Academy and translated into Portuguese by Fundação Portugal Telecom (see all available videos at http://fundacao.telecom.pt/khanacademy).
Tags: Estat Descritiva, inferência
9 “must read” articles on Big Data
Posted by Armando Brito Mendes | Filed under estatística, materiais para profissionais, visualização
My selection
- Big Data – From Descriptive to Prescriptive
- Can big data be racist?
- NodeXL Graph Gallery: Graph Details
- Best Metrics For Digital Marketing: Rock Your Own And Rent Strategies
- Big Data: from mining to meaning
- Beautiful versus useful visualizations (in French, but interesting)
- Learning and Teaching Machine Learning: A Personal Journey
- Big data techniques and technologies
- The Sexiest Job of the 21st Century is Tedious, and that Needs to C… (*)
- From the trenches: 360-degree data science
(*) I disagree with this Harvard Business Review author. Senior data scientists work on high level data from various sources, use automated processes for EDA (exploratory analysis) and spend little to no time in tedious, routine, mundane tasks (less than 5% of my time, in my case). I also use robust techniques that work well on relatively dirty data, and … I create and design the data myself in many cases.
Tags: belo, big data, data mining, Estat Descritiva, grafos
Chart errors in the news
Posted by Armando Brito Mendes | Filed under estatística, visualização
Fox News bar chart gets it wrong
Because Fox News. See also this, this, and this. [Thanks, Meron]
Tags: análise de dados, belo, data mining, Estat Descritiva
SPSS Macros on the Internet
Posted by Armando Brito Mendes | Filed under estatística, materiais para profissionais, software
What sources of SPSS macros are available on the Internet?
Here are a few that I know about; I hope other people will tell us about ones that should be listed but aren’t.
An obvious starting point is SPSS Inc’s own Macro Library at http://www.spss.com/tech/stat/macros/ (it doesn’t contain very many, though, and they are statistical rather than utilities). If you are planning to adapt or write macros, it’s also worth seeing what’s in SPSS Inc’s AnswerNet Solutions. Go to http://www.spss.com/tech/answer/, specify Product: SPSS Base and Free Text: macro, then click on the page’s Search button.
Raynald Levesque’s site http://pages.infinit.net/rlevesqu/ includes many pages on macros (including examples and some tutorial materials). But you should also look at the examples in his pages on syntax, as some of these are based on macros.
Newsgroups are also a useful source of macros. Searches of their archives can be very rewarding if you can get your search terms right (see our Other Internet Resources page).
Confidence intervals for proportions, differences between proportions and related quantities. See Dr Robert G. Newcombe’s home page at http://www.uwcm.ac.uk/uwcm/ms/Robert.html. Note that these are SPSS programs rather than macros, despite being described as macros by the author.
Polytomous logistic regression (of particular interest to users of SPSS 8.0 and earlier). For macros by John Hendrickx and Prof. Dr. Steffen Kühnel see http://www.sls.wau.nl/bk/bedrijfskunde/jhendrickx/spss/mlogist/
Regression: evaluating collinearity in models with interactions or non-linear terms. For a macro by Ben Pelzer, Manfred te Grotenhuis, Jan Lammers, John Hendrickx, see http://www.sls.wau.nl/bk/bedrijfskunde/jhendrickx/spss/perturb/perturb.html
Tags: Estat Descritiva, IBM SPSS Statistics, inferência, software estatístico