smartdatacollective.com portal

A news portal on data science, big data, and analytics

SmartData Collective, an online community moderated by Social Media Today, provides enterprise leaders access to the latest trends in Business Intelligence and Data Management. Our innovative model serves as a platform for recognized, global experts to share their insights through peer contributions, custom content publishing and alignment with industry leaders. SmartData Collective is a key resource for executives who need to make informed data management decisions.

Data Intelligence and Analytics Resources

Excellent articles on data science and big data

3. Big Data

4. Visualization

5. Best and Worst of Data Science

6. New Analytics Start-up Ideas

7. Rants about Healthcare, Education, etc.

8. Career Stuff, Training, Salary Surveys

9. Miscellaneous

10. DSC Webinar Series – with video access

17 short tutorials all data scientists should read

Excellent, essential articles for data scientists

Here’s the list:

Related link: The Data Science Toolkit

Resource types in Project

Explains the resource types in MS Project

Resource types in Project: work, material, and cost. Recent articles here on Blogtek have covered the configuration precautions to take before entering tasks, as well as costs and calendars. Today we will see how the resource types in Project can be configured.

Docear – The Academic Literature Suite

Excellent document management software for writing theses and articles

Docear is a unique solution to academic literature management, i.e. it helps you organize, create, and discover academic literature. Among others, Docear offers:

  1. A single-section user interface that allows the most comprehensive organization of your literature. With Docear, you can sort documents into categories; you can sort annotations (comments, bookmarks, and highlighted text from PDFs) into categories; you can sort annotations within PDFs; and you can view multiple annotations of multiple documents, in multiple categories, all at once.
  2. A ‘literature suite concept’ that combines several tools in a single application (PDF management, reference management, mind mapping, …). This allows you to draft your own papers, assignments, theses, etc. directly in Docear and copy annotations and references from your collection directly into your draft.
  3. A recommender system that helps you discover new literature: Docear recommends papers that are free, available in full text, instantly downloadable, and tailored to your information needs.

And did we mention that Docear is free, open source, available for Windows, Linux, and Mac OS X, and not evil?

Apache Spark

An alternative to Hadoop for in-memory data computing

What is Apache Spark?

Apache Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write.

To run programs faster, Spark offers a general execution model that can optimize arbitrary operator graphs, and supports in-memory computing, which lets it query data faster than disk-based engines like Hadoop.

To make programming faster, Spark provides clean, concise APIs in Scala, Java and Python. You can also use Spark interactively from the Scala and Python shells to rapidly query big datasets.

What can it do?

Spark was initially developed for two applications where placing data in memory helps: iterative algorithms, which are common in machine learning, and interactive data mining. In both cases, Spark can run up to 100x faster than Hadoop MapReduce. However, you can use Spark for general data processing too. Check out our example jobs.
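The benefit of in-memory data for iterative algorithms can be illustrated with a minimal sketch in plain Python. This is not Spark code and does not use Spark's API: `make_loader` and `fit_mean` are hypothetical helpers invented for this illustration, and the "disk read" is simulated by a counter. The sketch only shows that an iterative job which reloads its input on every pass performs one read per iteration, while one that keeps the data in memory reads it once.

```python
from typing import Callable, Dict, List, Tuple


def make_loader(data: List[float]) -> Tuple[Callable[[], List[float]], Dict[str, int]]:
    """Return a loader that counts how often the data is 'read'.

    Hypothetical stand-in for an expensive disk or network read.
    """
    stats = {"loads": 0}

    def load() -> List[float]:
        stats["loads"] += 1
        return list(data)

    return load, stats


def fit_mean(load: Callable[[], List[float]], iterations: int,
             cache: bool, lr: float = 0.5) -> float:
    """Gradient descent on squared error; the estimate approaches the mean."""
    cached = load() if cache else None  # read once and keep in memory
    estimate = 0.0
    for _ in range(iterations):
        points = cached if cache else load()  # otherwise reread every pass
        gradient = sum(estimate - x for x in points) / len(points)
        estimate -= lr * gradient
    return estimate


if __name__ == "__main__":
    for cache in (False, True):
        load, stats = make_loader([1.0, 2.0, 3.0, 4.0])
        result = fit_mean(load, iterations=20, cache=cache)
        print(f"cache={cache}: estimate={result:.4f}, loads={stats['loads']}")
```

Both runs compute the same estimate; only the number of reads differs, which is the cost that in-memory computing is meant to avoid.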

Spark is also the engine behind Shark, a fully Apache Hive-compatible data warehousing system that can run 100x faster than Hive.

While Spark is a new engine, it can access any data source supported by Hadoop, making it easy to run over existing data.

Big Data or Pig Data?

A tale about the need for domain knowledge (theory)

(A fable on huge amounts of data and why we don’t need models)

There was a pig who wanted to be a scientist. He was not interested in models. When asked how he planned on making sense of the world, the pig would say in a deep mysterious voice, “I don’t do models: the world is my model” and then with a twinkle in his eyes, look at his interlocutor smugly.

By his phrase, “I don’t do models: the world is my model”, he meant that the world’s data was enough for him, the pig scientist. The more data he had, the pig declared, the more accurately he would be able to predict what might happen in the world.

Brainstorm

A good article on a technique for capturing tacit knowledge

Brainstorm, or Brainstorming, literally means “storm of ideas”. In Brazil, it is sometimes jokingly called “toró de parpites” (roughly, “a downpour of hunches”). It is a creative technique for generating ideas and solutions. Because it is so simple, it is often applied poorly, as if it were just a casual chat. Here on Blogtek we will look at some techniques for finding solutions to problems.

Brainstorm – definition and applications

Brainstorm – principles

Brainstorm – rules

Brainstorm – stages

The Field Guide to Data Science

A good e-book on data science

Data Science is the competitive advantage of the future for organizations interested in turning their data into a product through analytics. Industries from health, to national security, to finance, to energy can be improved by creating better data analytics through Data Science. The winners and the losers in the emerging data economy are going to be determined by their Data Science teams.

Booz Allen Hamilton created The Field Guide to Data Science to help organizations of all types and missions understand how to make use of data as a resource. The text spells out what Data Science is and why it matters to organizations as well as how to create Data Science teams. Along the way, our team of experts provides field-tested approaches, personal tips and tricks, and real-life case studies. Senior leaders will walk away with a deeper understanding of the concepts at the heart of Data Science. Practitioners will add to their toolboxes.

A good article on mistakes professionals make when using statistics

Alex Reinhart, a statistics PhD student at Carnegie Mellon University, covers some common analysis mistakes in Statistics Done Wrong.

Statistics Done Wrong is a guide to the most popular statistical errors and slip-ups committed by scientists every day, in the lab and in peer-reviewed journals. Many of the errors are prevalent in vast swathes of the published literature, casting doubt on the findings of thousands of papers. Statistics Done Wrong assumes no prior knowledge of statistics, so you can read it before your first statistics course or after thirty years of scientific practice.

The text is available for free online, and there’s a physical book version on the way.