MARRYING YOUNGER AND OLDER
Posted by Armando Brito Mendes | Filed under visualização
A dynamic histogram with a movable bar
By Nathan Yau
In our earlier years, we tend to date and marry others who are around our age. However, this is not true for everyone. Variation kicks in when you look at the later years, consider multiple marriages, divorce, separation, and opposite-sex versus same-sex relationships.
Check the following interactive chart to see how the age distributions break down, among partners who live together.
Tags: casamento, histograma, idades
Mapping America’s access to nature, neighborhood by neighborhood
Posted by Armando Brito Mendes | Filed under infogramas \ dashboards, mapas SIG's, visualização
Report with maps and a good bubble chart
Analysis by Harry Stevens, Climate Lab columnist
April 10, 2024 at 7:30 a.m.
A city is a science experiment. What happens when we separate human beings from the environment in which they evolved? Can people be healthy without nature? The results have been bleak. Countless studies have shown that people who spend less time in nature die younger and suffer higher rates of mental and physical ailments.
“There’s a really, really strong case for proximity to nature influencing health in a really big way,” said Jared Hanley, the co-founder and CEO of NatureQuant, an Oregon start-up whose mission is to discover what kind of nature best supports human health, map where it is and persuade people to spend more time in it.
Using satellite imagery and data on dozens of factors — including air and noise pollution, park space, open water and tree canopy — NatureQuant has distilled the elements of health-supporting nature into a single variable called NatureScore. Aggregated to the level of Census tracts — roughly the size of a neighborhood — the data provide a high-resolution image of where nature is abundant and where it is lacking across the United States.
National Longitudinal Surveys
Posted by Armando Brito Mendes | Filed under data sets, visualização
American survey data with very large files
Accessing NLS Data
Public-Use Data
NLS public-use data for each cohort are available at no cost via Investigator, an online search and extraction site that enables you to review NLS variables and create your own data sets. It is not necessary to get an account to browse data, but an account is necessary to save datasets online.
The Investigator User’s Guide describes how to use this website.
An available tutorial also teaches how to search for variables in the Investigator.
For users who have the capacity to utilize extremely large data files and the programs to handle them, downloads are available for NLSY97, NLSY79, and NLSY79 Child and Young Adult.
Tags: inquéritos, labor, statistics, survey
Look into the machine’s mind
Posted by Armando Brito Mendes | Filed under LLMs, visualização
A web app for exploring the various paths obtained from ChatGPT's responses to “what is intelligence”
the data
Using the ChatGPT API, I ran the same completion prompt “Intelligence is “ hundreds of times (setting the temperature quite high, at 1.6, for more diverse responses). Given a text, a Large Language Model assigns a probability to each word (token) that could come next, picks one, and repeats this process until a completion is…well, complete.
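To make the role of temperature concrete, here is a minimal sketch of temperature-scaled sampling. It assumes the usual textbook formulation (divide the raw scores by the temperature, then apply softmax); the tokens and scores are made up for illustration and are not taken from the ChatGPT API, whose internals are not described in this post.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive (0 means: always pick the top token)")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def sample_token(tokens, logits, temperature, rng=random):
    """Draw one token at random according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical candidates and scores for the token after "Intelligence is "
tokens = ["the", "a", "not", "often"]
logits = [3.0, 2.0, 1.0, 0.5]

hot = softmax_with_temperature(logits, 1.6)   # flatter: more diverse samples
cold = softmax_with_temperature(logits, 0.2)  # sharper: the top token dominates
```

Calling `sample_token(tokens, logits, 1.6)` repeatedly would give varied first words, which is the effect the high temperature is meant to produce.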
semantic space (behind)
Each text (a prompt completion or a sub-sequence) has an embedding: a position in a 1536-dimensional space (I call it semantic space, or s²₁₅₃₆). For each response there’s a trajectory through s²₁₅₃₆ that passes through the embedding of each successive sub-sequence of words, for example: “Intelligence is “ → “Intelligence is the” → “Intelligence is the ability” → “Intelligence is the ability to” → … → full completion.
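The list of sub-sequences behind one trajectory can be sketched as follows. The function name `prefixes` is hypothetical, and splitting on whole words rather than model tokens is a simplification; each returned text would then be sent to an embedding model, which is not shown here.

```python
def prefixes(prompt, completion_words):
    """Return the prompt followed by every progressively longer sub-sequence."""
    steps = [prompt]
    text = prompt.rstrip()
    for word in completion_words:
        text = text + " " + word
        steps.append(text)
    return steps

trajectory = prefixes("Intelligence is ", ["the", "ability", "to"])
```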
Because I cannot visualize a 1536-dimensional space (yet), I use a popular technique called Principal Component Analysis. For the set of points I have, it tells me which dimensions are the most important (principal), and it allows me to rotate the high-dimensional space so that, when projected into only 3 dimensions, the points are scattered as much as possible. It’s the best possible linear reduction of dimensions. In fewer words: it compresses a high-dimensional space into a few dimensions while preserving as much information as it can. It is more or less what you do when drawing something: you choose a perspective (you rotate the object) that shows the most relevant information. I call this new space s²₃, and it’s what I visualize.
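For readers who want to see the idea in code, below is a small pure-Python sketch of PCA using power iteration with deflation. This is one simple way to compute principal directions, not necessarily what the app uses; in practice a numerical library would do this faster. All function names are hypothetical.

```python
import math
import random

def center(points):
    """Subtract the per-dimension mean so the cloud is centered on the origin."""
    n, dim = len(points), len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(dim)]
    return [[p[j] - means[j] for j in range(dim)] for p in points]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_direction(rows, iterations=200, seed=0):
    """Power iteration: find the unit direction along which the rows vary most."""
    rng = random.Random(seed)
    dim = len(rows[0])
    v = [rng.gauss(0, 1) for _ in range(dim)]
    for _ in range(iterations):
        scores = [dot(r, v) for r in rows]  # X v
        # X^T (X v), computed without building the covariance matrix
        w = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(dim)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # no variance left to explain
            return [0.0] * dim
        v = [x / norm for x in w]
    return v

def pca_project(points, k=3):
    """Project points onto their first k principal directions."""
    centered = center(points)
    residual = [row[:] for row in centered]
    directions = []
    for _ in range(k):
        v = top_direction(residual)
        directions.append(v)
        # Deflation: remove what this direction already explains.
        residual = [[x - dot(r, v) * vj for x, vj in zip(r, v)] for r in residual]
    return [[dot(row, v) for v in directions] for row in centered]
```

Given a list of 1536-dimensional embeddings, `pca_project(embeddings, k=3)` would return one 3-D point per embedding, ordered so that the first coordinate carries the most variance.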
What you see in the cube is a tree of trajectories that bifurcate. All start with “Intelligence is “ and progress towards longer and less probable sub-sequences of responses. It’s a different representation of the same tree being visualized on the right (both visualizations communicate).
The tree visualization (right)
Visualizes all collected completions. It also represents the calculated probability of a word following a text (because the sample is small, this is only a good approximation for the initial levels of the tree), so “Intelligence is the “ will be followed by “ability” ~75% of the time, at a temperature of 1.6. If the temperature were lower, this probability would rise, reaching certainty at temperature=0.
By hovering over a word, which corresponds to a point in a sub-sequence, you can see in the cube the trajectory from the prompt to all the completions that start with that sub-sequence.
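The probabilities in such a tree can be estimated by simple counting. The sketch below uses four made-up completions (not the collected data) and a hypothetical helper, `next_word_probabilities`, that maps each prefix to the observed share of each following word.

```python
from collections import Counter, defaultdict

def next_word_probabilities(completions):
    """Map each prefix (a tuple of words) to the observed share of each next word."""
    counts = defaultdict(Counter)
    for completion in completions:
        words = completion.split()
        for i, word in enumerate(words):
            counts[tuple(words[:i])][word] += 1
    return {
        prefix: {w: n / sum(c.values()) for w, n in c.items()}
        for prefix, c in counts.items()
    }

# Made-up completions of "Intelligence is ", for illustration only
samples = [
    "the ability to learn",
    "the ability to adapt",
    "the ability to reason",
    "the capacity to learn",
]
probs = next_word_probabilities(samples)
```

With these samples, `probs[("the",)]` gives 0.75 for "ability" and 0.25 for "capacity". Deeper prefixes are backed by fewer samples, which is why the estimates are only trustworthy near the root of the tree.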
Try other prompts:
· Chatgpt is
· Best thing about AI is
· When
· Santiago Ortiz is (yes, this is a selfai. What I found interesting is that it’s ~50% truth ~50% bs, and it feels like it describes alternative versions of myself in the multiverse)
· My dream
· Tell me a story:
· Intelligence is
references
Simulating my friend Philippe, where I explain embeddings, and how they are used to run semantic search and to find the proper knowledge from a corpus to use it as context for LLMs prompts
A deeper explanation of LLMs, next token prediction, temperature and embeddings, by Stephen Wolfram
English by degrees: the original next-word prediction model, by Claude Shannon
moebio for more experiments and data projects
Tags: bifurcações, chatGPT, word network
When Your Vision and Hearing Decline with Age
Posted by Armando Brito Mendes | Filed under Data Science, infogramas \ dashboards, visualização
Good line charts with curve fitting
By Nathan Yau
If you want to feel like you’re getting old, visit an optometrist and have them tell you that in 6 to 12 months you won’t be able to read things up close and you’ll need bifocals.
For most of my life, I had good vision without glasses or contacts, but in my mid-30s I noticed the basketball score on television looking kind of blurry. I had astigmatism. Just a little.
My prescription didn’t change for years. Until recently. My optometrist hit me with the news that most people start to have trouble reading up close between 39 and 43 years old. I had to look into it.
The following chart shows the percentage of adults who wear glasses or contacts, by age, based on data from the National Health Interview Survey.
Tags: ajuste de curvas, gráficos de pontos, idade, perda de audição, perda de visão, velhice
Airfoil
Posted by Armando Brito Mendes | Filed under infogramas \ dashboards, lições, visualização
Excellent animations of physical phenomena such as airflow over airplane wings or in other media
The dream of soaring in the sky like a bird has captivated the human mind for ages. Although many failed, some eventually succeeded in achieving that goal. These days we take air transportation for granted, but the physics of flight can still be puzzling.
In this article we’ll investigate what makes airplanes fly by looking at the forces generated by the flow of air around the aircraft’s wings. More specifically, we’ll focus on the cross section of those wings to reveal the shape of an airfoil.
Tags: animações, física, fluxo de ar, visualizações
Common Age Differences, Married Couples
Posted by Armando Brito Mendes | Filed under Data Science, visualização
Good pin charts and scatter plots with an outlier
By Nathan Yau
Through pop culture, it sometimes seems like it’s common for there to be a wide age difference between spouses. How common are the age gaps, really? These are the age differences through the lens of the 2022 five-year American Community Survey.
Tags: casais, Estat Descritiva, gráfico de alfinetes, gráfico de dispersão, idade, outlier
Why Line Chart Baselines Can Start at Non-Zero
Posted by Armando Brito Mendes | Filed under Data Science, estatística, lições, visualização
A good demonstration, with dynamic charts, of how charts can be misleading
By Nathan Yau
There is a recurring argument that line chart baselines must start at zero, because anything else would be misleading, dishonest, and an insult to all that is good in the world. The critique is misguided.
Tags: enganador, gráfico de linhas, gráficos, linha base
Full Of Themselves
Posted by Armando Brito Mendes | Filed under Data Science, visualização
A very well explained data-processing report
An analysis of title drops in movies
by Dominikus Baur + Alice Thudt
A title drop is when a character in a movie says the title of the movie they’re in. Here’s a large-scale analysis of 73,921 movies from the last 80 years on how often, when and maybe even why that happens.
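As a rough illustration of the definition, a naive title-drop check could look like the sketch below. It is an assumption-laden simplification, not the authors' method: it only looks for the normalized title as a whole phrase in each line of dialogue, and the title and lines are invented.

```python
import re

def normalize(text):
    """Lowercase and reduce the text to letter/digit runs joined by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

def find_title_drops(title, dialogue_lines):
    """Return the indexes of dialogue lines that contain the title as a whole phrase."""
    needle = f" {normalize(title)} "
    return [
        i for i, line in enumerate(dialogue_lines)
        if needle in f" {normalize(line)} "
    ]

# Invented title and dialogue, for illustration only
lines = [
    "We should leave before dark.",
    "They call it the long night.",
    "The Long Night? Nobody survives it.",
    "A long nightmare, more like.",
]
drops = find_title_drops("The Long Night", lines)
```

Here `drops` is `[1, 2]`: the last line is skipped because "nightmare" is not the whole word "night". Recording the line index also gives a crude answer to when in the movie the drop happens.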
Tags: análise de dados, filmes, IMDb, visualização
Feeling Rested with Age
Posted by Armando Brito Mendes | Filed under Sem categoria
Stacked or separated bar chart
By Nathan Yau
How much you sleep each night matters, but what matters more is the quality of that sleep and whether you feel rested when you wake up. This seems to shift with age as responsibilities and sleep patterns change.
The following chart shows how rested people felt, based on answers to the American Time Use Survey.
BETTER REST WHEN OLDER
People were asked, “When you woke up yesterday, how well-rested did you feel?” By age 50 to 59, the very-well group passes the halfway mark.