Visualizing the confidence interval

Posted by Armando Brito Mendes | Filed under statistics, teaching materials, visualization

A good way to visualize the concept of a random confidence interval.

About the visualization

Some say that a shift from hypothesis testing to confidence intervals and estimation will lead to fewer statistical misinterpretations. Personally, I am not sure about that. But I agree with the sentiment that we should stop reducing statistical analysis to binary decision-making. The problem with CIs is that they are as unintuitive and as misunderstood as p-values and null hypothesis significance testing. Moreover, CIs are often used to perform hypothesis tests and are therefore prone to the same misuses as p-values.
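The "random interval" idea can also be checked numerically. The sketch below is not taken from the linked visualization; it assumes a normal population with known standard deviation and uses the textbook z-interval, and the function name `coverage` and its defaults are illustrative. It draws many samples, builds a 95% interval from each, and counts how often the interval contains the true mean.

```python
import random
import statistics


def coverage(n_intervals=1000, sample_size=30, mu=0.0, sigma=1.0, z=1.96, seed=1):
    """Fraction of simulated z-intervals (known sigma) that contain the true mean mu."""
    rng = random.Random(seed)
    # Half-width is fixed because sigma is assumed known.
    half_width = z * sigma / sample_size ** 0.5
    hits = 0
    for _ in range(n_intervals):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mean = statistics.fmean(sample)
        # The interval is random; the parameter mu is fixed.
        if mean - half_width <= mu <= mean + half_width:
            hits += 1
    return hits / n_intervals


if __name__ == "__main__":
    print(f"share of intervals covering the true mean: {coverage():.3f}")
```

The printed share should land near 0.95. That figure describes the procedure over repeated sampling, not the probability that any single computed interval contains the mean.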

Tags: beautiful, definition, inference

Read more | Comments off | June 29th, 2015

Rtips. Revival 2014!

Posted by Armando Brito Mendes | Filed under statistics, mathematics, software

An animation of all the places mentioned in a Johnny Cash song

Lots of R examples on a single long page.

Table of Contents

Section: Original Preface

Section 1: Data Input/Output

Subsection 1.1: Bring raw numbers into R (05/22/2012)

Subsection 1.2: Basic notation on data access (12/02/2012)

Subsection 1.3: Checkout the new Data Import/Export manual (13/08/2001)

Subsection 1.4: Exchange data between R and other programs (Excel, etc) (01/21/2009)

Subsection 1.5: Merge data frames (04/23/2004)

Subsection 1.6: Add one row at a time (14/08/2000)

Subsection 1.7: Need yet another different kind of merge for data frames (11/08/2000)

Subsection 1.8: Check if an object is NULL (06/04/2001)

Subsection 1.9: Generate random numbers (12/02/2012)

Subsection 1.10: Generate random numbers with a fixed mean/variance (06/09/2000)

Subsection 1.11: Use rep to manufacture a weighted data set (30/01/2001)

Subsection 1.12: Convert contingency table to data frame (06/09/2000)

Subsection 1.13: Write: data in text file (31/12/2001)

Section 2: Working with data frames: Recoding, selecting, aggregating

Subsection 2.1: Add variables to a data frame (or list) (02/06/2003)

Subsection 2.2: Create variable names on the fly (10/04/2001)

Subsection 2.3: Recode one column, output values into another column (12/05/2003)

Subsection 2.4: Create indicator (dummy) variables (20/06/2001)

Subsection 2.5: Create lagged values of variables for time series regression (05/22/2012)

Subsection 2.6: How to drop factor levels for datasets that don’t have observations with those values? (08/01/2002)

Subsection 2.7: Select/subset observations out of a dataframe (08/02/2012)

Subsection 2.8: Delete first observation for each element in a cluster of observations (11/08/2000)

Subsection 2.9: Select a random sample of data (11/08/2000)

Subsection 2.10: Selecting Variables for Models: Don’t forget the subset function (15/08/2000)

Subsection 2.11: Process all numeric variables, ignore character variables? (11/02/2012)

Subsection 2.12: Sorting by more than one variable (06/09/2000)

Subsection 2.13: Rank within subgroups defined by a factor (06/09/2000)

Subsection 2.14: Work with missing values (na.omit, is.na, etc) (15/01/2012)

Subsection 2.15: Aggregate values, one for each line (16/08/2000)

Subsection 2.16: Create new data frame to hold aggregate values for each factor

Subsection 2.17: Selectively sum columns in a data frame (15/01/2012)

Subsection 2.18: Rip digits out of real numbers one at a time (11/08/2000)

Subsection 2.19: Grab an item from each of several matrices in a List (14/08/2000)

Subsection 2.20: Get vector showing values in a dataset (10/04/2001)

Subsection 2.21: Calculate the value of a string representing an R command (13/08/2000)

Subsection 2.22: Which can grab the index values of cases satisfying a test

Subsection 2.23: Find unique lines in a matrix/data frame (31/12/2001)

Section 3: Matrices and vector operations

Subsection 3.1: Create a vector, append values (01/02/2012)

Subsection 3.2: How to create an identity matrix? (16/08/2000)

Subsection 3.3: Convert matrix m to one long vector (11/08/2000)

Subsection 3.4: Creating a peculiar sequence (1 2 3 4 1 2 3 1 2 1) (11/08/2000)

Subsection 3.5: Select every n’th item (14/08/2000)

Subsection 3.6: Find index of a value nearest to 1.5 in a vector (11/08/2000)

Subsection 3.7: Find index of nonzero items in vector (18/06/2001)

Subsection 3.8: Find index of missing values (15/08/2000)

Subsection 3.9: Find index of largest item in vector (16/08/2000)

Subsection 3.10: Replace values in a matrix (22/11/2000)

Subsection 3.11: Delete particular rows from matrix (06/04/2001)

Subsection 3.12: Count number of items meeting a criterion (01/05/2005)

Subsection 3.13: Compute partial correlation coefficients from correlation matrix

Subsection 3.14: Create a multidimensional matrix (R array) (20/06/2001)

Subsection 3.15: Combine a lot of matrices (20/06/2001)

Subsection 3.16: Create neighbor matrices according to specific logics (20/06/2001)

Subsection 3.17: Matching two columns of numbers by a key variable (20/06/2001)

Subsection 3.18: Create Upper or Lower Triangular matrix (06/08/2012)

Subsection 3.19: Calculate inverse of X (12/02/2012)

Subsection 3.20: Interesting use of Matrix Indices (20/06/2001)

Subsection 3.21: Eigenvalues example (20/06/2001)

Section 4: Applying functions, tapply, etc

Subsection 4.1: Return multiple values from a function (12/02/2012)

Subsection 4.2: Grab ‘‘p’’ values out of a list of significance tests (22/08/2000)

Subsection 4.3: ifelse usage (12/02/2012)

Subsection 4.4: Apply to create matrix of probabilities, one for each cell (14/08/2000)

Subsection 4.5: Outer. (15/08/2000)

Subsection 4.6: Check if something is a formula/function (11/08/2000)

Subsection 4.7: Optimize with a vector of variables (11/08/2000)

Subsection 4.8: slice.index, like in S+ (14/08/2000)

Section 5: Graphing

Subsection 5.1: Adjust features with par before graphing (18/06/2001)

Subsection 5.2: Save graph output (03/21/2014)

Subsection 5.3: How to automatically name plot output into separate files (10/04/2001)

Subsection 5.4: Control papersize (15/08/2000)

Subsection 5.5: Integrating R graphs into documents: LaTeX and EPS or PDF (20/06/2001)

Subsection 5.6: ‘‘Snapshot’’ graphs and scroll through them (31/12/2001)

Subsection 5.7: Plot a density function (eg. Normal) (22/11/2000)

Subsection 5.8: Plot with error bars (11/08/2000)

Subsection 5.9: Histogram with density estimates (14/08/2000)

Subsection 5.10: How can I ‘‘overlay’’ several line plots on top of one another? (09/29/2005)

Subsection 5.11: Create ‘‘matrix’’ of graphs (18/06/2001)

Subsection 5.12: Combine lines and bar plot? (07/12/2000)

Subsection 5.13: Regression scatterplot: add fitted line to graph (03/20/2014)

Subsection 5.14: Control the plotting character in scatterplots? (11/08/2000)

Subsection 5.15: Scatterplot: Control Plotting Characters (men vs women, etc) (11/11/2002)

Subsection 5.16: Scatterplot with size/color adjustment (12/11/2002)

Subsection 5.17: Scatterplot: adjust size according to 3rd variable (06/04/2001)

Subsection 5.18: Scatterplot: smooth a line connecting points (02/06/2003)

Subsection 5.19: Regression Scatterplot: add estimate to plot (18/06/2001)

Subsection 5.20: Axes: controls: ticks, no ticks, numbers, etc (22/11/2000)

Subsection 5.21: Axes: rotate labels (06/04/2001)

Subsection 5.22: Axes: Show formatted dates in axes (06/04/2001)

Subsection 5.23: Axes: Reverse axis in plot (12/02/2012)

Subsection 5.24: Axes: Label axes with dates (11/08/2000)

Subsection 5.25: Axes: Superscript in axis labels (11/08/2000)

Subsection 5.26: Axes: adjust positioning (31/12/2001)

Subsection 5.27: Add ‘‘error arrows’’ to a scatterplot (30/01/2001)

Subsection 5.28: Time Series: how to plot several ‘‘lines’’ in one graph? (06/09/2000)

Subsection 5.29: Time series: plot fitted and actual data (11/08/2000)

Subsection 5.30: Insert text into a plot (22/11/2000)

Subsection 5.31: Plotting unbounded variables (07/12/2000)

Subsection 5.32: Labels with dynamically generated content/math markup (16/08/2000)

Subsection 5.33: Use math/sophisticated stuff in title of plot (11/11/2002)

Subsection 5.34: How to color-code points in scatter to reveal missing values of 3rd variable? (15/08/2000)

Subsection 5.35: lattice: misc examples (12/11/2002)

Subsection 5.36: Make 3d scatterplots (11/08/2000)

Subsection 5.37: 3d contour with line style to reflect value (06/04/2001)

Subsection 5.38: Animate a Graph! (13/08/2000)

Subsection 5.39: Color user-portion of graph background differently from margin (06/09/2000)

Subsection 5.40: Examples of graphing code that seem to work (misc) (11/16/2005)

Section 6: Common Statistical Chores

Subsection 6.1: Crosstabulation Tables (01/05/2005)

Subsection 6.2: t-test (18/07/2001)

Subsection 6.3: Test for Normality (31/12/2001)

Subsection 6.4: Estimate parameters of distributions (12/02/2012)

Subsection 6.5: Bootstrapping routines (14/08/2000)

Subsection 6.6: BY subgroup analysis of data (summary or model for subgroups) (06/04/2001)

Section 7: Model Fitting (Regression-type things)

Subsection 7.1: Tips for specifying regression models (12/02/2002)

Subsection 7.2: Summary Methods, grabbing results inside an output object

Subsection 7.3: Calculate separate coefficients for each level of a factor (22/11/2000)

Subsection 7.4: Compare fits of regression models (F test subset B’s =0) (14/08/2000)

Subsection 7.5: Get Predicted Values from a model with predict() (11/13/2005)

Subsection 7.6: Polynomial regression (15/08/2000)

Subsection 7.7: Calculate p value for an F stat from regression (13/08/2000)

Subsection 7.8: Compare fits (F test) in stepwise regression/anova (11/08/2000)

Subsection 7.9: Test significance of slope and intercept shifts (Chow test?)

Subsection 7.10: Want to estimate a nonlinear model? (11/08/2000)

Subsection 7.11: Quasi family and passing arguments to it. (12/11/2002)

Subsection 7.12: Estimate a covariance matrix (22/11/2000)

Subsection 7.13: Control number of significant digits in output (22/11/2000)

Subsection 7.14: Multiple analysis of variance (06/09/2000)

Subsection 7.15: Test for homogeneity of variance (heteroskedasticity) (12/02/2012)

Subsection 7.16: Use nls to estimate a nonlinear model (14/08/2000)

Subsection 7.17: Using nls and graphing things with it (22/11/2000)

Subsection 7.18: and hypo tests (22/11/2000)

Subsection 7.19: logistic regression with repeated measurements (02/06/2003)

Subsection 7.20: Logit (06/04/2001)

Subsection 7.21: Random parameter (Mixed Model) tips (01/05/2005)

Subsection 7.22: Time Series: basics (31/12/2001)

Subsection 7.23: Time Series: misc examples (10/04/2001)

Subsection 7.24: Categorical Data and Multivariate Models (04/25/2004)

Subsection 7.25: Lowess. Plot a smooth curve (04/25/2004)

Subsection 7.26: Hierarchical/Mixed linear models. (06/03/2003)

Subsection 7.27: Robust Regression tools (07/12/2000)

Subsection 7.28: Durbin-Watson test (10/04/2001)

Subsection 7.29: Censored regression (04/25/2004)

Section 8: Packages

Subsection 8.1: What packages are installed on Paul’s computer?

Subsection 8.2: Install and load a package

Subsection 8.3: List Loaded Packages

Subsection 8.4: Where is the default R library folder? Where does R look for packages in a computer?

Subsection 8.5: Detach libraries when no longer needed (10/04/2001)

Section 9: Misc. web resources

Subsection 9.1: Navigating R Documentation (12/02/2012)

Subsection 9.2: R Task View Pages (12/02/2012)

Subsection 9.3: Using help inside R (13/08/2001)

Subsection 9.4: Run examples in R (10/04/2001)

Section 10: R workspace

Subsection 10.1: Writing, saving, running R code (31/12/2001)

Subsection 10.2: .RData, .RHistory. Help or hassle? (31/12/2001)

Subsection 10.3: Save & Load R objects (31/12/2001)

Subsection 10.4: Reminders for object analysis/usage (11/08/2000)

Subsection 10.5: Remove objects by pattern in name (31/12/2001)

Subsection 10.6: Save work/create a Diary of activity (31/12/2001)

Subsection 10.7: Customized Rprofile (31/12/2001)

Section 11: Interface with the operating system

Subsection 11.1: Commands to system like change working directory (22/11/2000)

Subsection 11.2: Get system time. (30/01/2001)

Subsection 11.3: Check if a file exists (11/08/2000)

Subsection 11.4: Find files by name or part of a name (regular expression matching) (14/08/2001)

Section 12: Stupid R tricks: basics you can’t live without

Subsection 12.1: If you are asking for help (12/02/2012)

Subsection 12.2: Commenting out things in R files (15/08/2000)

Section 13: Misc R usages I find interesting

Subsection 13.1: Character encoding (01/27/2009)

Subsection 13.2: list names of variables used inside an expression (10/04/2001)

Subsection 13.3: R environment in side scripts (10/04/2001)

Subsection 13.4: Derivatives (10/04/2001)

Tags: data analysis, descriptive statistics, inference, R-software, statistical software

Read more | Comments off | May 30th, 2015

Rice Virtual Lab in Statistics

Posted by Armando Brito Mendes | Filed under statistics, teaching materials

Useful references for basic statistics concepts.

	HyperStat Online: An online statistics book with links to other statistics resources on the web.
	Simulations/Demonstrations: Java applets that demonstrate various statistical concepts.
	Case Studies: Examples of real data with analyses and interpretation.
	Analysis Lab: Some basic statistical analysis tools.

Tags: descriptive statistics, inference

Read more | Comments off | May 29th, 2015

SticiGui – online statistics book

Posted by Armando Brito Mendes | Filed under statistics, teaching materials

Tags: descriptive statistics, inference

Read more | Comments off | January 19th, 2015

Tutorial: How to detect spurious correlations

Posted by Armando Brito Mendes | Filed under statistics, teaching materials

Using robust methods to identify spurious correlations

Tutorial: How to detect spurious correlations, and how to find the real ones

Specifically designed in the context of big data in our research lab, the new and simple strong correlation synthetic metric proposed in this article should be used, whenever you want to check if there is a real association between two variables, especially in large-scale automated data science or machine learning projects. Use this new metric now, to avoid being accused of reckless data science and even being sued for wrongful analytic practice.
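To make the idea of a spurious correlation concrete, here is a small sketch. It does not implement the synthetic metric proposed in the linked article; it swaps in a standard robust alternative, Spearman rank correlation, and compares it with the ordinary Pearson coefficient. The data are made up for illustration: ten essentially unrelated pairs plus one extreme outlier.

```python
def pearson(x, y):
    """Ordinary Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Illustrative data (an assumption, not from the article):
# ten unrelated pairs followed by a single extreme point at (100, 100).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]
y = [6, 2, 9, 4, 10, 1, 7, 3, 8, 5, 100]

if __name__ == "__main__":
    print(f"Pearson:  {pearson(x, y):.3f}")
    print(f"Spearman: {spearman(x, y):.3f}")
    print(f"Pearson without the outlier: {pearson(x[:-1], y[:-1]):.3f}")
```

On this data the single outlier pushes Pearson close to 1, while the rank-based coefficient stays low and Pearson on the first ten points is near zero. A large gap between the two coefficients is a hint that the apparent association rests on a few extreme points.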

Tags: data mining, descriptive statistics, inference

Read more | Comments off | August 12th, 2014

Khan Academy: Law of large numbers

Posted by Armando Brito Mendes | Filed under statistics, teaching materials, videos

Good video lessons in Portuguese on the law of large numbers

Original video: Law of Large Numbers (https://www.khanacademy.org/math/probability/random-variables-topic/random_variables_prob_dist/v/law-of-large-numbers). Khan Academy Portugal offers free online Mathematics lessons from the 1st through the 12th year of schooling. This video was produced by Khan Academy and translated into Portuguese by Fundação Portugal Telecom (see all available videos at http://fundacao.telecom.pt/khanacademy).
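For readers who prefer to see the law of large numbers in action, the following sketch simulates it. It is not taken from the video; it assumes rolls of a fair six-sided die, and the function name `running_means` is illustrative. It tracks the running average of the rolls, which should drift toward the expected value of 3.5 as the number of rolls grows.

```python
import random


def running_means(n_rolls, seed=0):
    """Running average of n_rolls simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)
    total = 0
    means = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)  # randint includes both endpoints
        means.append(total / i)
    return means


if __name__ == "__main__":
    means = running_means(100_000)
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(f"after {n:>7} rolls: mean = {means[n - 1]:.4f}")
```

Early averages can sit well away from 3.5, but the later ones settle close to it. The law says nothing about short runs "balancing out", only about the long-run average.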

Tags: inference