Data for Tat
Posted by Armando Brito Mendes | Filed under data sets
Data for: Tat will tell: Tattoos and time preferences
Published: 21 October 2019 | Version 1 | DOI: 10.17632/p7xw6yvd5c.1
Contributors: Bradley Ruffle,
Description
Dataset in Stata 10 format, collected from incentivized experiments and a survey.
Categories: Economics, Social Psychology
This dataset is a supplement to
*provided by DataCite
figshare – a home for research outputs
Posted by Armando Brito Mendes | Filed under Data Science, data sets, statistics
An excellent source of data and studies.
the repository built to showcase all of your institution’s research outputs in one place
Tags: data, studies, research
Data Quality for AI
Posted by Armando Brito Mendes | Filed under Data Science, teaching materials, materials for professionals
An IBM page with several resources on data preprocessing and data quality assessment.
The Data Quality for AI (DQAI) framework of services provides tools that enable model developers and data scientists to implement a formalized, systematic program of data preparation, the preliminary and most time-consuming step of the model development lifecycle. The framework is appropriate for data being readied for supervised classification or regression tasks. It includes the necessary software to:
— implement quality checks,
— execute remediation,
— generate audit reports,
— automate all the above.
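The four steps above can be illustrated with a small sketch. This is not the DQAI API: the function names (check_quality, remediate, run_pipeline), the choice of checks, and the sample records are all assumptions made for illustration, using only the Python standard library.

```python
from collections import Counter


def check_quality(rows, label_key):
    """Run three simple checks on a list of dict records (hypothetical checks)."""
    # Count missing (None) values per field.
    missing = Counter(k for r in rows for k, v in r.items() if v is None)
    # Count exact duplicate records beyond their first occurrence.
    seen, duplicates = set(), 0
    for r in rows:
        key = tuple(sorted(r.items()))
        if key in seen:
            duplicates += 1
        seen.add(key)
    # Class balance of the target column, ignoring missing labels.
    labels = Counter(r[label_key] for r in rows if r[label_key] is not None)
    return {
        "rows": len(rows),
        "missing": dict(missing),
        "duplicates": duplicates,
        "label_counts": dict(labels),
    }


def remediate(rows):
    """Drop exact duplicates and any record with a missing value."""
    seen, clean = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key in seen or any(v is None for v in r.values()):
            continue
        seen.add(key)
        clean.append(r)
    return clean


def run_pipeline(rows, label_key):
    """Automate check -> remediate -> re-check and return an audit report."""
    before = check_quality(rows, label_key)
    clean = remediate(rows)
    after = check_quality(clean, label_key)
    return clean, {"before": before, "after": after}


# Made-up sample records for a binary classification task.
sample = [
    {"age": 34, "income": 52000, "label": "yes"},
    {"age": 34, "income": 52000, "label": "yes"},    # exact duplicate
    {"age": None, "income": 41000, "label": "no"},   # missing value
    {"age": 51, "income": 67000, "label": "no"},
]

clean, report = run_pipeline(sample, "label")
print(report)
```

Dropping incomplete records is only one possible remediation. Imputing values or flagging records for human review would fit the same pipeline shape.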
While pipelining of tasks is essential for scalability and repeatability, the included capabilities can also be used for custom data exploration and human-guided improvement of models. Although the included services can be productive at any stage in the model development lifecycle, the offering is designed to be especially valuable early on, in the data preparation stage.
In addition to all that can be accomplished on original data sources, there are methods that, starting from an input dataset, can help synthesize new data — either for supplementation or for replacement — by learning constraints in the original data or having them specified by a developer. This can be helpful when regulatory or contractual issues prohibit direct usage of data in a modeling effort, when it is desirable to explore datasets with different constraints, or when more data is needed for training.
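As a rough illustration of synthesizing data from constraints, the sketch below learns very simple per-column constraints (a numeric range or a set of allowed categories) from an input dataset and samples new rows that satisfy them. It is an assumption-laden toy, not the method used by the IBM services: the function names and sample data are hypothetical, and columns are sampled independently, so correlations between columns are not preserved.

```python
import random


def learn_constraints(rows):
    """Learn a numeric range or a set of allowed values for each column."""
    constraints = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        if all(isinstance(v, (int, float)) for v in values):
            constraints[col] = ("range", min(values), max(values))
        else:
            constraints[col] = ("choices", sorted(set(values)))
    return constraints


def synthesize(constraints, n, seed=0):
    """Sample n rows that satisfy the constraints, each column independently."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    out = []
    for _ in range(n):
        row = {}
        for col, spec in constraints.items():
            if spec[0] == "range":
                row[col] = rng.uniform(spec[1], spec[2])
            else:
                row[col] = rng.choice(spec[1])
        out.append(row)
    return out


# Made-up input dataset.
original = [
    {"age": 34, "label": "yes"},
    {"age": 51, "label": "no"},
    {"age": 27, "label": "no"},
]

constraints = learn_constraints(original)
# A developer can also specify a constraint directly instead of learning it.
constraints["age"] = ("range", 18, 65)
synthetic = synthesize(constraints, n=5)
print(synthetic)
```

Overriding the learned age range shows the second option the text mentions, where constraints are specified by a developer rather than learned from the data.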
This offering is appropriate for use on both tabular and time series data, and support for new modalities is being developed.
Tags: data, data preparation, data quality