Create a barebones R package from scratch
Posted by Armando Brito Mendes | Filed under statistics, software
While we’re on an R kick, Hilary Parker described how to create an R package from scratch, not just to share code with others but to save yourself some time on future projects. It’s not as hard as it seems.
This tutorial is not about making a beautiful, perfect R package. This tutorial is about creating a bare-minimum R package so that you don’t have to keep thinking to yourself, “I really should just make an R package with these functions so I don’t have to keep copy/pasting them like a goddamn luddite.” Seriously, it doesn’t have to be about sharing your code (although that is an added benefit!). It is about saving yourself time. (n.b. this is my attitude about all reproducibility.)
I need to do this. I’ve been meaning to wrap everything up for a while now, but it seemed like such a chore. Sometimes I’d even go back to my own tutorials for some copy and paste action. Now I know better. And that’s half the battle.
Tags: data mining, R-software, statistical software
Using R in Nonparametric Statistical Analysis
Posted by Armando Brito Mendes | Filed under statistics, teaching materials, software
- Using R in Nonparametric Statistical Analysis: The Kruskal-Wallis Test for One-Way Analysis of Variance
- Using R in Nonparametric Statistical Analysis: The Binomial Sign Test
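As a rough illustration of the second item, here is a minimal sketch of an exact two-sided binomial sign test for paired samples. It is not taken from the linked tutorials, it is written in Python rather than R, and the function name `sign_test_p_value` and the sample data are hypothetical. The idea: drop tied pairs, count the positive differences, and compare that count with a Binomial(n, 0.5) distribution.

```python
from math import comb


def sign_test_p_value(before, after):
    """Exact two-sided sign test p-value for paired samples (hypothetical helper)."""
    # Keep only the non-zero differences; ties carry no sign information.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0  # No informative pairs, so no evidence against the null.
    positives = sum(1 for d in diffs if d > 0)
    # Under the null hypothesis, positives ~ Binomial(n, 0.5).
    k = min(positives, n - positives)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the smaller tail and cap the result at 1.
    return min(1.0, 2 * tail)


# Made-up measurements for eight subjects, before and after a treatment.
before = [10, 12, 9, 14, 11, 13, 10, 12]
after = [12, 14, 10, 15, 11, 15, 9, 14]
p_value = sign_test_p_value(before, after)
print(p_value)
```

In this made-up data one pair is tied and dropped, leaving seven pairs, six of which increased.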
Tags: software development, R-software, statistical software
Why use R? Five reasons
Posted by Armando Brito Mendes | Filed under materials for professionals, software
In this post I will go through 5 reasons: zero cost, crazy popularity, awesome power, dazzling flexibility, and mind-blowing support. I believe R is the best statistical programming language to learn. As a blogger who has contributed over 150 posts in Stata and over 100 in R, I have extensive experience with both a proprietary statistical programming language and the open source alternative. In my graduate career I have also had the opportunity to experiment with the proprietary software SPSS, SAS, Mathematica, and MPlus.
Tags: big data, definition, R-software, statistical software
SPSS Internet Resources
Posted by Armando Brito Mendes | Filed under statistics, software
The SPSS Inc website
SPSS are now owned by IBM. The following links lead to the appropriate IBM pages now.
http://www-01.ibm.com/software/analytics/spss/ The home page of the SPSS Inc. website
http://www-01.ibm.com/software/uk/analytics/spss/ SPSS Inc. UK page
(If at some future time SPSS Inc change the structure of their website, you may find that only the first of the above links still works.)
The ASSESS-NEWS list
http://www.jiscmail.ac.uk/lists/assess-news.html Information about it, and an archive of past messages.
Other useful links
news:comp.soft-sys.stat.spss The SPSS newsgroup (this carries fairly heavy traffic).
Tags: IBM SPSS Statistics, statistical software
SPSS Macros on the Internet
Posted by Armando Brito Mendes | Filed under statistics, materials for professionals, software
What sources of SPSS macros are available on the Internet?
Here are a few that I know about; I hope other people will tell us about ones that should be listed but aren’t.
An obvious starting point is SPSS Inc’s own Macro Library at http://www.spss.com/tech/stat/macros/ (it doesn’t contain very many, though, and they are statistical rather than utilities). If you are planning to adapt or write macros, it’s also worth seeing what’s in SPSS Inc’s AnswerNet Solutions. Go to http://www.spss.com/tech/answer/, specify Product: SPSS Base and Free Text: macro, then click on the page’s Search button.
Raynald Levesque’s site http://pages.infinit.net/rlevesqu/ includes many pages on macros (including examples and some tutorial materials). But you should also look at the examples in his pages on syntax, as some of these are based on macros.
Newsgroups are also a useful source of macros. Searches of their archives can be very rewarding if you can get your search terms right (see our Other Internet Resources page).
Confidence intervals for proportions, differences between proportions and related quantities. See Dr Robert G. Newcombe’s home page at http://www.uwcm.ac.uk/uwcm/ms/Robert.html. Note that these are SPSS programs rather than macros, despite being described as macros by the author.
Polytomous logistic regression (of particular interest to users of SPSS 8.0 and earlier). For macros by John Hendrickx and Prof. Dr. Steffen Kühnel see http://www.sls.wau.nl/bk/bedrijfskunde/jhendrickx/spss/mlogist/
Regression: evaluating collinearity in models with interactions or non-linear terms. For a macro by Ben Pelzer, Manfred te Grotenhuis, Jan Lammers, John Hendrickx, see http://www.sls.wau.nl/bk/bedrijfskunde/jhendrickx/spss/perturb/perturb.html
Tags: descriptive statistics, IBM SPSS Statistics, inference, statistical software
Read Histograms and Use Them in R
Posted by Armando Brito Mendes | Filed under statistics, materials for professionals, visualization
How to Read Histograms and Use Them in R
The histogram is one of my favorite chart types, and for analysis purposes, I probably use them the most. Devised by Karl Pearson (the father of mathematical statistics) in the late 1800s, it’s simple geometrically, robust, and allows you to see the distribution of a dataset.
If you don’t understand what’s driving the chart though, it can be confusing, which is probably why you don’t see it often in general publications.
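To make "what's driving the chart" concrete, here is a minimal sketch of the counting step behind a histogram: values are sorted into equal-width bins, and each bar's height is the number of values in its bin. This is an illustration in Python rather than R, not code from the linked post; the function name `histogram_counts` and the sample values are hypothetical. R's own `hist()` picks its breaks and interval boundaries by its own defaults, so its counts can differ from this sketch.

```python
def histogram_counts(values, bin_width):
    """Return (left_edge, right_edge, count) for equal-width bins [left, right)."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    if not values:
        return []
    lowest = min(values)
    # Enough bins so that the largest value lands in the last bin.
    n_bins = int((max(values) - lowest) // bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - lowest) // bin_width)] += 1
    return [
        (lowest + i * bin_width, lowest + (i + 1) * bin_width, c)
        for i, c in enumerate(counts)
    ]


# Made-up data: the same values binned at two different widths.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
narrow = histogram_counts(values, 2)
wide = histogram_counts(values, 4)

# Print a text histogram, one row of '#' marks per bin.
for left, right, count in narrow:
    print(f"[{left}, {right}): {'#' * count}")
```

The same data looks quite different at the two widths, which is why the bin width matters when reading a histogram: the narrow bins expose the gap before the outlying value, while the wide bins lump most of the data into a single bar.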
Tags: data analysis, data mining, descriptive statistics, R-software, statistical software
IFORS Education Resources Project
Posted by Armando Brito Mendes | Filed under statistics, Operations Research, DSS, software
Welcome to the International Federation of Operational Research Societies (IFORS) Education Resources Project
- Main Page (19:13, 3 December 2013)
- Biased Random-Key Genetic Algorithms: A Tutorial (21:57, 2 December 2013)
- The Discrete Event System Specification Formalism (19:59, 2 December 2013)
- Urban Operations Research (01:36, 2 December 2013)
- Stochastic Models for Design and Planning (01:34, 2 December 2013)
- Queueing Theory Books Online (01:31, 2 December 2013)
- Practical Queueing Theory in Java (01:31, 2 December 2013)
- Explore Queueing Theory for Scheduling, Resource Allocation and Traffic Flow Applications (01:28, 2 December 2013)
- Stochastic Processes Course Notes (01:26, 2 December 2013)
- Test Problems for Non-Linear Programming (01:23, 2 December 2013)
- OR Notes: Separable Programming (01:21, 2 December 2013)
- OR Notes: Non-Linear Programming (01:20, 2 December 2013)
Tags: group decision-making, medical decision-making, optimization, forecasting, problems, spreadsheet programming, optimization software, statistical software
How many statisticians does it take to split a bill?
Posted by Armando Brito Mendes | Filed under statistics, software
Some thoughts on the Fall term, now that Spring is well under way [edit: added a few more points]:
- RMarkdown and knitr are amazing. When I next teach a course using R, my students will be turning in homeworks using these tools: The output immediately shows whether the code runs and what its results are. This is much better than students copying and pasting possibly-broken code and unconnected output into a text file or (gasp) Word document.
- I’m glad my cohort socializes outside the office, taking each other out for birthday lunches or going to see a Pirates game. Some of the older PhD students are so focused on their thesis work that they don’t take time for a social break, and I’d like to avoid getting stuck in that rut.
However! Our lunches always lead us back to the age old question: How many statisticians does it take to split a bill? Answer: too long. I threw together a Shiny app, DinneR, to help us answer this question.
Tags: big data, data mining, R-software, statistical software
Using Dates and Times in R
Posted by Armando Brito Mendes | Filed under statistics, software
Today at the Davis R Users’ Group, Bonnie Dixon gave a tutorial on the various ways to handle dates and times in R. Bonnie provided this great script which walks through essential classes, functions, and packages. Here it is piped through knitr::spin. The original R script can be found as a gist here.
Tags: data mining, R-software, statistical software
Learn R interactively with the swirl package
Posted by Armando Brito Mendes | Filed under statistics, teaching materials, software
swirl is a software package for the R statistical programming language. Its purpose is to teach users statistics and R simultaneously and interactively.
Tags: data mining, software development, R-software, statistical software