Remembering the lives lost to COVID-19 in America

In an attempt to convey the scale of the numbers, the authors present a graphic story based on the size of diamonds.

As COVID-19 began to spread in the U.S. in March 2020, Trump administration officials estimated 100,000 to 200,000 Americans might die. A worst-case scenario, they said, meant between 1.6 million and 2.2 million might perish. The figures felt staggeringly high.

Two years later, the U.S. has reached 1 million deaths even as COVID has faded from the headlines.

At this grim milestone, we sought to refocus on the scale of loss suffered. Scroll below to see more.


Who We Spend Time with as We Get Older

An animated horizontal bar chart showing variations over time.

By Nathan Yau

In high school, we spend most of our days with friends and immediate family. Then we get older and get jobs, get married, and grow our own families to spend more time with co-workers, spouses, and kids. Here’s how things change, based on a decade of data from the American Time Use Survey, from age 15 to 80.


Data Quality for AI

An IBM page with several resources on data preprocessing and data quality assessment.

This Data Quality for AI (or DQAI, for short) framework of services provides the tools that enable model developers and data scientists to implement a formalized and systematic program of data preparation, the preliminary and most time-consuming step of the model development lifecycle. This framework is appropriate for data being readied for supervised classification or regression tasks. It includes the necessary software to:

— implement quality checks,
— execute remediation,
— generate audit reports,
— automate all the above.

While pipelining of tasks is essential for scalability and repeatability, the included capabilities can also be used for custom data exploration and human-guided improvement of models. Although the included services can be productive at any stage in the model development lifecycle, the offering is designed to be especially valuable early on, in the data preparation stage.

In addition to all that can be accomplished on original data sources, there are methods that, starting from an input dataset, can help synthesize new data — either for supplementation or for replacement — by learning constraints in the original data or having them specified by a developer. This can be helpful when regulatory or contractual issues prohibit direct usage of data in a modeling effort, when it is desirable to explore datasets with different constraints, or when more data is needed for training.

This offering is appropriate for use on both tabular and time series data, with support for new modalities being developed.
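To make the check, remediate, and report cycle described above more concrete, here is a minimal sketch in plain Python. It does not use the IBM services or their API; the dataset, the column names, and the functions `check_quality`, `remediate`, and `audit_report` are all hypothetical and exist only for illustration. The remediation choices (dropping exact duplicates, filling gaps with the column median) are assumptions, not the framework's documented behavior.

```python
from statistics import median

# Hypothetical toy dataset (not from the IBM page): a small labeled table.
rows = [
    {"age": 34, "income": 52000.0, "label": "yes"},
    {"age": None, "income": 61000.0, "label": "no"},
    {"age": 29, "income": None, "label": "no"},
    {"age": 34, "income": 52000.0, "label": "yes"},  # exact duplicate of the first row
]
NUMERIC_COLUMNS = ["age", "income"]  # assumed numeric columns


def row_key(row):
    """Hashable representation of a row, used to detect exact duplicates."""
    return tuple(sorted(row.items(), key=lambda item: item[0]))


def check_quality(rows):
    """Quality check: count missing values per column and exact duplicate rows."""
    missing = {}
    seen = set()
    duplicates = 0
    for row in rows:
        for column, value in row.items():
            if value is None:
                missing[column] = missing.get(column, 0) + 1
        key = row_key(row)
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {"missing": missing, "duplicates": duplicates}


def remediate(rows):
    """Remediation: drop exact duplicates, then fill numeric gaps with the column median."""
    seen = set()
    unique = []
    for row in rows:
        key = row_key(row)
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))  # copy so the input is left untouched
    for column in NUMERIC_COLUMNS:
        observed = [r[column] for r in unique if r[column] is not None]
        fill_value = median(observed)
        for r in unique:
            if r[column] is None:
                r[column] = fill_value
    return unique


def audit_report(before, after):
    """Audit report: summarize the issues found before and after remediation."""
    return {
        "rows_before": len(before),
        "rows_after": len(after),
        "issues_before": check_quality(before),
        "issues_after": check_quality(after),
    }


# "Automate all the above": run the three steps as one small pipeline.
clean = remediate(rows)
report = audit_report(rows, clean)
print(report)
```

Keeping each step as a separate function is one way to get the repeatability the excerpt mentions: the same functions can run unattended in a pipeline or be called one at a time during manual exploration.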



Age of Moms When Kids are Born

A good example of pin charts.

By Nathan Yau

People have kids at a wide range of ages, but the moments tend towards where we are in life. There are social norms and biological norms. Based on data from the National Center for Health Statistics, we can see how these ranges shift by child number.


The World Chess Championship In 5 Charts

A description of a chess championship using difference charts, histograms, heat maps, and radar charts.

How Magnus Carlsen cemented his GOAT status over 11 games.

By Simran Parwani and Oliver Roeder

Published Dec. 14, 2021

This article is part of our 2021 World Chess Championship series.

The 2021 World Chess Championship ended last week with Magnus Carlsen of Norway, the world No. 1, defending his title against challenger Ian Nepomniachtchi of Russia. It was Carlsen’s fifth victory in the world championship, a title he has held since 2013, and the match went a long way toward cementing his status as the greatest chess player of all time.

The contest featured some of the best chess ever played by humans, nearly flawless even when examined by modern, superhuman machines. It also featured a few inexplicable blunders, and just three bad moves saw Nepomniachtchi’s chances slip quickly and irretrievably away. The match also generated a lot of data! We’ve charted some of it below.


A catalog of all the Covid visualizations

Many visualizations, generally very good, and some very original.

The COVID-19 Online Visualization Collection is a project to catalog Covid-related graphics across countries, sources, and styles. They call it COVIC for short, which seems like a stretch for an acronym and a confusing way to introduce a project to people. But it does categorize over 10,000 figures, which could be useful as a reference and historical context.


What People Spend Most of Their Money On, By Income Group, Relatively Speaking

A report with many line charts.

By Nathan Yau

The more money people come across, the more things they can and tend to buy. More money on average means bigger houses, more expensive cars, and fancier restaurants. But what if you look at relative spending instead of total dollars?

For example, if a lower income group uses 9 percent of their total spending to pay a mortgage, does the higher income group also pay 9 percent? Or does additional income go to other spending categories?

It varies.

The charts below show how different income groups spend their money, based on data from the Bureau of Labor Statistics for 2020. Each chart represents a spending category. Each column represents an income group.
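The difference between total dollars and relative spending can be sketched with a few lines of Python. The dollar amounts below are hypothetical and are not taken from the Bureau of Labor Statistics data; the group names, categories, and the `relative_shares` function are likewise assumptions made for illustration.

```python
# Hypothetical annual spending in dollars (NOT Bureau of Labor Statistics figures).
spending = {
    "lower income": {"housing": 9000, "food": 4500, "transportation": 3000, "other": 3500},
    "higher income": {"housing": 30000, "food": 12000, "transportation": 15000, "other": 43000},
}


def relative_shares(categories):
    """Convert dollar amounts into percentages of the group's total spending."""
    total = sum(categories.values())
    return {name: round(100 * amount / total, 1) for name, amount in categories.items()}


# One dictionary of percentage shares per income group.
shares = {group: relative_shares(categories) for group, categories in spending.items()}

for group, group_shares in shares.items():
    print(group, group_shares)
```

In this made-up example the higher income group spends more dollars in every category, yet its housing share is smaller (30 percent against 45 percent), while the transportation share is the same for both groups. That is the kind of pattern a relative view can surface and a total-dollar view hides.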


How to Use t-SNE Effectively

A fairly thorough explanation of the difficulties of interpreting plots produced by the t-SNE algorithm.

Although extremely useful for visualizing high-dimensional data, t-SNE plots can sometimes be mysterious or misleading. By exploring how it behaves in simple cases, we can learn to use it more effectively.


Explained Visually

Good interactive visual explanations of ML and math concepts.

Ordinary Least Squares Regression

Where do betas come from?

EV 9 – 2015/02/12

Principal Component Analysis

Axis of easy.

EV 8 – 2015/01/29

Image Kernels

The kernel’s secret recipe.

EV 6 – 2015/01/20

Eigenvectors and Eigenvalues

No, no. Do it eigen!

EV 5 – 2014/11/28

Pi (π)

Pi me to the moon.

EV 4 – 2014/11/21

Sine and Cosine

Sine on the line.

EV 3 – 2014/11/14

Exponentiation

Growing, growing, gone.

EV 2 – 2014/11/07

Markov Chains

Mark on, Markov

EV 1 – 2014/10/30

Conditional probability

You probably wouldn’t understand.


Map made of candy corn to show corn production

An example of a map made with physical objects.

With candy corn as her medium, Jill Hubley mapped corn production in the United States, based on data from the USDA. With just three hues of yellow, orange, and white and three heights to match, Hubley was able to clearly show the geographical patterns.
