Armando B. Mendes

Department for Transport publishes official statistics relating to the transport system in Great Britain

Posted by Armando Brito Mendes | Filed under data sets

Good data sets to use in projects.

Department for Transport publishes official statistics relating to the transport system in Great Britain.

Read more | Comments off | June 4th, 2020

Google Cloud Platform data sets

Posted by Armando Brito Mendes | Filed under data sets

The data sets available on Google's cloud services.

About COVID-19 Public Datasets | BigQuery Public Datasets Program | Getting started with COVID-19 Public Datasets

American Community Survey (ACS) | United States Census Bureau | Detailed US demographic data at various geographic resolutions

Argentina Real Estate Listings | Properati | Monthly property listing data for Argentina since 2016

Austin Crime Data | City of Austin | City of Austin crime data for 2014 and 2015

Births Data Summary | Centers for Disease Control | Natality Data from CDC Births

Bitcoin Cash Cryptocurrency Dataset | Bitcoin Cash | The Bitcoin Cash blockchain loaded to BigQuery and updated daily

Tags: BigQuery, data, Google

Read more | Comments off | April 19th, 2020

Journal of Physics: Conference Series

Posted by Armando Brito Mendes | Filed under statistics, Operations Research, mathematics, bibliographic references

A long list of papers from Mathematics and Physics conferences.

The open access Journal of Physics: Conference Series (JPCS) provides a fast, versatile and cost-effective proceedings publication service.

Tags: scientific papers

Read more | Comments off | March 6th, 2020

A Beginner’s Guide to learn web scraping with python!

Posted by Armando Brito Mendes | Filed under lessons, teaching materials, materials for professionals, software

A good description of web scraping with Python.

Web Scraping with Python

Imagine you have to pull a large amount of data from websites and you want to do it as quickly as possible. How would you do it without manually going to each website and getting the data? Well, “Web Scraping” is the answer. Web Scraping just makes this job easier and faster.
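To make the idea concrete, here is a minimal sketch of the extraction step, using only Python's standard-library html.parser. The linked article may well use different libraries; the sample HTML, the TitleScraper class and the scrape_titles helper are hypothetical names chosen for illustration, and a real scraper would first download the page instead of using an inline string.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this HTML first.
SAMPLE_HTML = """
<html><body>
  <h2 class="title">First post</h2>
  <p>Some text</p>
  <h2 class="title">Second post</h2>
</body></html>
"""


class TitleScraper(HTMLParser):
    """Collect the text of every <h2 class="title"> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "h2" and ("class", "title") in attrs:
            self._in_title = True
            self.titles.append("")

    def handle_data(self, data):
        # Accumulate text only while inside a matching heading
        if self._in_title:
            self.titles[-1] += data

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False


def scrape_titles(html):
    """Return the stripped text of all title headings found in html."""
    parser = TitleScraper()
    parser.feed(html)
    parser.close()
    return [title.strip() for title in parser.titles]


titles = scrape_titles(SAMPLE_HTML)
print(titles)
```

The same pattern (parse the markup, pick out the elements of interest, collect their text) applies regardless of which parsing library is used.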

In this article on Web Scraping with Python, you will learn about web scraping in brief and see how to extract data from a website with a demonstration. I will be covering the following topics:

Tags: Python

Read more | Comments off | March 3rd, 2020

Build Pipelines with Pandas Using pdpipe

Posted by Armando Brito Mendes | Filed under programming languages, software

KDnuggets: click on the image to follow the link

A good description of pipelines with Pandas DataFrames.

Introduction

Pandas is an amazing library in the Python ecosystem for data analytics and machine learning. It forms the perfect bridge between the data world, where Excel/CSV files and SQL tables live, and the modeling world, where Scikit-learn or TensorFlow perform their magic.

A data science flow is most often a sequence of steps — datasets must be cleaned, scaled, and validated before they can be ready to be used by that powerful machine learning algorithm.

These tasks can, of course, be done with the many single-step functions and methods offered by packages like Pandas, but a more elegant way is to use a pipeline. In almost all cases, a pipeline reduces the chance of error and saves time by automating repetitive tasks.

In the data science world, great examples of packages with pipeline features are dplyr in the R language and Scikit-learn in the Python ecosystem.

The following is a great article about the use of Scikit-learn pipelines in a machine-learning workflow.

Managing Machine Learning Workflows with Scikit-learn Pipelines Part 1: A Gentle Introduction
Are you familiar with Scikit-learn Pipelines? They are an extremely simple yet very useful tool for managing machine…

Pandas also offers a .pipe method, which can be used for similar purposes with user-defined functions. However, in this article, we are going to discuss a wonderful little library called pdpipe, which specifically addresses this pipelining issue for Pandas DataFrames.
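As a rough illustration of the .pipe approach mentioned above (not of pdpipe itself), the sketch below chains three user-defined steps on a small DataFrame. The column names, the sample data and the helper functions drop_missing, add_price_per_sqft and keep_columns are all made up for this example.

```python
import pandas as pd


def drop_missing(df):
    """Remove rows that contain any missing value."""
    return df.dropna()


def add_price_per_sqft(df):
    """Add a derived column without modifying the input frame."""
    return df.assign(price_per_sqft=df["price"] / df["area"])


def keep_columns(df, columns):
    """Keep only the listed columns, in the given order."""
    return df[list(columns)]


# Hypothetical raw data with one incomplete row
raw = pd.DataFrame({
    "price": [100000.0, 250000.0, None],
    "area": [1000.0, 2000.0, 1500.0],
    "city": ["A", "B", "C"],
})

# Each .pipe call passes the DataFrame as the first argument of the function
clean = (
    raw.pipe(drop_missing)
       .pipe(add_price_per_sqft)
       .pipe(keep_columns, columns=["city", "price_per_sqft"])
)
print(clean)
```

Because every step takes a DataFrame and returns a new one, the steps can be reordered or reused on other data sets, which is the property that dedicated pipeline libraries build on.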

Tags: data mining, pandas, pipelines, Python

Read more | Comments off | February 13th, 2020

The Beautiful Hidden Logic of Cities

Posted by Armando Brito Mendes | Filed under GIS maps, teaching materials, materials for professionals, visualization

Click on the image to follow the link

Patterns identified in city maps.

After finishing my map of the most common road suffixes by length, I realized I could also map each individual road, colored by its suffix. This has led to the loveliest maps I’ve made.

Driving around your city, you’re probably somewhat aware of Avenues and Boulevards and Streets and Roads and so on. Here in Portland, at least, I know that Avenues run north-south and Streets run east-west. However, it’s hard to get an overall view of how all these road designations knit together. By coloring them, we can suddenly see a new, stunning view of what we normally take for granted.

Tags: knowledge capture, image mining, maps

Read more | Comments off | September 30th, 2019

Machine Learning and Data Science Cheat Sheet

Posted by Armando Brito Mendes | Filed under Uncategorized

Click on the image to follow the link

A short tutorial (with many links) on Linux and machine learning.

You can download the new machine learning cheat sheet here (PDF format, 14 pages.)

Originally published in 2014 and viewed more than 200,000 times, this is the oldest data science cheat sheet – the mother of all the numerous cheat sheets that are so popular nowadays. I decided to update it in June 2019. While the first half, dealing with installing components on your laptop and learning UNIX, regular expressions, and file management hasn’t changed much, the second half, dealing with machine learning, was rewritten entirely from scratch. It is amazing how things have changed in just five years!

Tags: data mining, Linux, machine learning

Read more | Comments off | September 5th, 2019

Usage of Asterisks in Python

Posted by Armando Brito Mendes | Filed under programming languages

Click on the image to follow the link

A tutorial on the various uses of * in Python.

Many Python users are familiar with using asterisks for the multiplication and power operators, but in this tutorial you'll find out additional ways to use the asterisk.

Most of us use asterisks as multiplication and power operators, but they also serve a wide range of purposes as a prefix operator in Python. After reading this article, you will know the full usage of asterisks.

Asterisks have many particular use cases in Python. In general, we are familiar with the multiplication and power operators. They can also perform other operations, such as unpacking and argument passing, in different situations. First, let's see the general usage of asterisks.
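The uses listed above can be sketched in a few lines. This is an illustrative summary rather than the tutorial's own code; the function names add, describe and collect are invented for the example.

```python
# Arithmetic: multiplication and power
product = 3 * 4
power = 2 ** 5


# Unpacking an iterable into positional arguments
def add(a, b, c):
    return a + b + c


total = add(*[1, 2, 3])


# Unpacking a dictionary into keyword arguments
def describe(name, language):
    return f"{name} writes {language}"


sentence = describe(**{"name": "Ana", "language": "Python"})


# Collecting arbitrary positional and keyword arguments
def collect(*args, **kwargs):
    return args, kwargs


args, kwargs = collect(1, 2, key="value")

# Extended unpacking in an assignment
first, *rest = [10, 20, 30]

# Merging iterables and dictionaries inside literals
merged_list = [*[1, 2], *[3]]
merged_dict = {**{"a": 1}, **{"b": 2}}

print(product, power, total, sentence, args, kwargs, first, rest)
```

Note that *args always arrives as a tuple and **kwargs as a dictionary, while the starred name in an assignment always receives a list.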

Tags: Python

Read more | Comments off | September 5th, 2019

The Sleep Blanket

Posted by Armando Brito Mendes | Filed under visualization

Click on the image to follow the link

A visualization of my son’s sleep pattern from birth to his first birthday. Crochet border surrounding a double knit body. Each row represents a single day. Each stitch represents 6 minutes of time spent awake or asleep #knitting #crochet #datavisualization
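The encoding described here (one row per day, one stitch per 6 minutes) gives 240 stitches per row. A minimal sketch of that mapping follows, assuming sleep is recorded as (start_minute, end_minute) intervals within a day; the interval format, the sample day and the day_to_row helper are assumptions for illustration, not details from the original project.

```python
MINUTES_PER_STITCH = 6
STITCHES_PER_ROW = 24 * 60 // MINUTES_PER_STITCH  # 240 stitches per day


def day_to_row(sleep_intervals):
    """Turn one day's sleep intervals into a row of stitches.

    Each interval is (start_minute, end_minute) counted from midnight.
    The row holds True for an "asleep" stitch and False for "awake".
    """
    row = [False] * STITCHES_PER_ROW
    for start, end in sleep_intervals:
        first_stitch = start // MINUTES_PER_STITCH
        last_stitch = end // MINUTES_PER_STITCH
        for stitch in range(first_stitch, last_stitch):
            row[stitch] = True
    return row


# Hypothetical day: asleep 00:00-06:30 and 13:00-14:00
row = day_to_row([(0, 390), (780, 840)])
asleep_stitches = sum(row)
print(STITCHES_PER_ROW, asleep_stitches)
```

Stacking one such row per day for a year yields the grid that the blanket renders in two colors.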

Tags: beautiful

Read more | Comments off | July 31st, 2019

Making of the Illustrations of the Natural Orders of Plants

Posted by Armando Brito Mendes | Filed under materials for professionals

Click on the image to follow the link

If someone told me when I was young that I would spend three months of my time tracing nineteenth century botanical illustrations and enjoy it, I would have scoffed, but that’s what I did to reproduce Elizabeth Twining’s Illustrations of the Natural Orders of Plants and I loved every minute.

After the unexpected successes of my Byrne’s Euclid and Werner’s Nomenclature of Colours projects (for which I’m very grateful), I got the itch to follow them up with another reproduction of an obscure catalog from the 1800s. However, finding interesting obscure catalogs wasn’t an easy task when I didn’t know what would pique my interest. Anything was fair game, but I had an inkling that something based on the sciences would be most interesting. Scientific catalogs are organized and structured, and data can be extracted from them with some elbow grease.

Tags: beautiful, knowledge capture, data mining, machine learning

Read more | Comments off | July 17th, 2019
