Welcome
Everything on the site is free, and all of it is just me sharing the knowledge I collected over the years. It would really help if you subscribed to my YouTube channel, or, if you feel very generous, you can make a small (or large :D) donation via Patreon.
What would you like to do/know? (click on an option to see the 'answer')
I want to know how to analyse my data
First determine which variable you want to analyse, or which variables are involved in the analysis you want to do. Then determine the measurement level of the variable(s) (binary, nominal, ordinal or scale). For each measurement level or combination of levels, the site has instructions on how to get an impression of the data, create a visualisation, perform the test (and post-hoc test), determine the effect size, and write up the results. You can select your situation from the top menu (how many variables are involved -> which measurement level(s)). The steps will then appear in the menu on the left.
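The selection step described above can be sketched as a small lookup in Python. This is only an illustration: the dictionary layout and the function name `suggest_test` are assumptions made for this example, while the test names are copied from the overview table on this page.

```python
# Illustrative lookup only: the structure and the name suggest_test are
# assumptions; the test names are copied from the overview table on this page.
LEVELS = ["binary", "nominal", "ordinal", "scale"]

TESTS = {
    # one variable: (level,)
    ("binary",): "one-sample binomial test",
    ("nominal",): "Pearson chi-square goodness of fit",
    ("ordinal",): "one-sample Wilcoxon signed rank test",
    ("scale",): "one-sample t-test",
    # two variables: (lower level, higher level, version)
    ("binary", "ordinal", "unpaired"): "Mann-Whitney U",
    ("binary", "scale", "unpaired"): "independent samples t-test",
    ("nominal", "nominal", "unpaired"): "Pearson chi-square test of independence",
    ("nominal", "nominal", "paired"): "Bhapkar test",
    ("nominal", "ordinal", "unpaired"): "Kruskal-Wallis H test",
    ("nominal", "scale", "unpaired"): "one-way ANOVA",
    ("ordinal", "ordinal", "unpaired"): "Goodman-Kruskal Gamma",
    ("ordinal", "ordinal", "paired"): "Wilcoxon signed rank test",
    ("ordinal", "scale", "unpaired"): "Spearman's rho",
    ("scale", "scale", "unpaired"): "Pearson correlation",
    ("scale", "scale", "paired"): "paired samples t-test",
}


def suggest_test(levels, version=None):
    """Return the test listed in the overview for the given measurement levels."""
    # The overview lists the lower measurement level first, so sort that way.
    ordered = sorted(levels, key=LEVELS.index)
    key = tuple(ordered) if version is None else tuple(ordered) + (version,)
    return TESTS.get(key, "not listed in the overview")


print(suggest_test(["scale", "binary"], "unpaired"))
print(suggest_test(["ordinal"]))
```

Combinations that do not appear in the overview simply return "not listed in the overview", so the sketch never guesses a test.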
I want to learn Statistics
This site is not intended as a Statistics course, but more as a quick reference guide. However, you could get a good start by first going to ‘Fundamentals’ in the top menu and going over each section in there. After that you should be able to follow any other section.
I am looking for more information about a specific term in Statistics
You could use either the index (incomplete at the moment) or the search option in the top menu.
I’m looking for the ThesisTools Pro tool
You can find it here
I want to understand the structure of this site
The site is split into a few sections. At Fundamentals some basic terms connected to statistics are briefly explained. A good understanding of these is needed to understand the rest of the site. This would be the place to start if you are not familiar with terms like sample, population, measurement level and significance. There is also a subsection at Fundamentals on the basics of SPSS. Note that for all analyses I try to add videos showing how to perform the analyses with SPSS, R and Excel.
At 'One variable' a sub entry is available for each possible type of variable. Each entry shows how you can perform a full analysis of that type of variable.
The same goes for Two variables - unpaired and Two variables - paired. The unpaired analysis is often used to show if there is an association/relation/correlation between two variables, or if one variable might affect the other. The paired version is for when you are interested in differences between two variables, or in whether people changed.
At 3+ variables - paired, you'll find out more about analysing multiple variables if for example you had a question where respondents could pick more than one answer, or if you had a set of questions all using the same scale (Likert items).
An understanding of the Fundamentals section is required for all other parts, but after that you can jump to any section you like.
What are your thoughts on the different programs you use on this site?
MS Excel (paid but often already installed)
Microsoft Excel probably doesn't need an introduction. It is the spreadsheet software from Microsoft. Although originally spreadsheet programs were used for financial and accounting purposes, they are often used for data analysis in general.
They come equipped with built-in functions for some basic statistics, but more sophisticated techniques require additional work. The advantage of using MS Excel is that it is already commonly used and doesn't require programming skills.
Flowgorithm (free but more just for learning)
This is a program that allows you to make flow-charts of your software program and can export them to a variety of different programming languages: Ada 95, AppleScript, Bash, C#, C++, Fortran 2003, Java, JavaScript, Lua, MATLAB, Nim, Pascal, Perl, PHP, Powershell, Python, QBasic, Ruby, Scala, Smalltalk, Swift, Transact-SQL, TypeScript, VBA, and Visual Basic .NET. It can also generate Auto, Gaddis and IBO pseudocode.
This is great for learning procedural programming, and understanding the math behind the various statistical calculations.
Python (free - requires programming)
A programming language that is often listed as one of the easiest languages to learn. It is also often used in data mining and has many libraries available that can make the statistical calculations a lot easier.
Unlike Excel, Python also has many functions (via libraries) that save you from having to do the calculations yourself by following the formulas. The downside is of course that you will need to learn a bit about programming.
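As a small illustration of that difference, the sketch below computes a sample standard deviation twice: once by following the formula, and once with a ready-made function from Python's built-in `statistics` module. The grades are made-up numbers, used only for this example.

```python
import math
import statistics

grades = [6.5, 7.0, 8.0, 5.5, 9.0]  # made-up example data

# By following the formula: s = sqrt( sum((x - mean)^2) / (n - 1) )
n = len(grades)
mean = sum(grades) / n
sd_by_formula = math.sqrt(sum((x - mean) ** 2 for x in grades) / (n - 1))

# With a library function that does the same calculation for you
sd_by_library = statistics.stdev(grades)

print(sd_by_formula, sd_by_library)
```

Both approaches give the same value; the library call is just shorter and leaves less room for mistakes.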
R (studio) (free - requires programming)
Similar to Python, but R was originally built for statistical calculations. In my opinion, if you are only interested in statistical calculations R might be better, but if you are also interested in programming, go with Python. That's just my opinion and oh boy are there many opinions on this on the internet...
SPSS Statistics (paid - no programming)
Currently owned by IBM, this program is designed for statistics. Unlike Python and R, it has a graphical user interface (GUI), so there is no need to learn programming; it is a matter of point-and-click. SPSS can do a lot more sophisticated statistics than Excel can, but it doesn't have all the statistical tests. If the GUI doesn't have a test, there are sometimes ways to still get it using SPSS syntax or the R connection, but in my opinion you could then just as well use R itself.
I just want to browse around
Sure, have fun
Overview
Below is an overview of the various tests.
Var1 | Var2 | version | Example question(s) | Visualisation | Test | Post-hoc | Effect size |
binary | | | Would the percentages for the two categories be equal in the population? | n.a. | one-sample binomial test | n.a. | Cohen's g |
nominal | | | Would the percentages for each category be equal in the population? | bar-chart | Pearson chi-square goodness of fit | pairwise binomial test with Bonferroni adjustment | Cramer's V |
ordinal | | | Did at least 50% of the respondents think it was good (or very good)? | dual-axis bar chart | one-sample Wilcoxon signed rank test | n.a. | Rosenthal correlation |
scale | | | Is the average age different from ….? | histogram | one-sample t-test | n.a. | Cohen's d |
binary | ordinal | unpaired | Does gender have an influence on opinion about activities? | compound bar-chart | Mann-Whitney U | n.a. | Rosenthal correlation |
binary | scale | unpaired | Does gender have an influence on grade? | split histogram | independent samples t-test | n.a. | Cohen's d_s |
nominal | nominal | unpaired | Does gender have an influence on favourite sport? Is there an association (relation) between gender and favourite sport? | clustered bar-chart | Pearson chi-square test of independence | adjusted residuals z-test | Cramer's V |
nominal | nominal | paired | Did the overall distribution of favourite brand change between before and after seeing a commercial from each brand? | clustered bar-chart | Bhapkar test | pairwise McNemar test with Bonferroni adjustment | |
nominal | ordinal | unpaired | Does location have an influence on opinion about….? Are there differences between the different locations in the opinion about….? | compound bar-chart | Kruskal-Wallis H test | Dunn's test | epsilon square |
nominal | scale | unpaired | Does location have an influence on the grade? Are there differences in the grades between the different locations? | split histogram | one-way ANOVA | pairwise t-test with Bonferroni or Games-Howell adjustment | eta square |
ordinal | ordinal | unpaired | Is there a relation/association/correlation between the opinion about …. and the opinion about….? | heat-map | Goodman-Kruskal Gamma | n.a. | Goodman-Kruskal Gamma |
ordinal | ordinal | paired | Did people's opinion on a brand change after seeing a commercial, compared to before the commercial? | compound bar-chart | Wilcoxon signed rank test | n.a. | correlation coefficient |
ordinal | scale | unpaired | Is there a relation/association/correlation between the opinion about …. and the grade? | split histogram | Spearman's rho | n.a. | Spearman's rho |
scale | scale | unpaired | Is there a relation/association/correlation between the grade for …. and the grade for…..? | scatterplot | Pearson correlation | n.a. | Determination coefficient |
scale | scale | paired | Did people's income change before vs after they followed a course? | scatterplot | paired samples t-test | n.a. | Cohen's d |
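To make the first row of the overview (one binary variable) a bit more concrete, here is a minimal Python sketch of a one-sample binomial test with Cohen's g as the effect size. The counts are made up, and the two-sided p-value is obtained by doubling the smaller tail, which is one common convention and works here because the hypothesized proportion is 0.5. The helper name `binom_cdf` is just for this example, and the dedicated pages on this site may use a different procedure.

```python
from math import comb

# Made-up data: 14 of 20 respondents chose the first category.
successes, n = 14, 20
p0 = 0.5  # hypothesized proportion: both categories equally likely


def binom_cdf(k, n, p):
    """P(X <= k) for a binomial distribution with n trials and probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Two-sided exact p-value: double the smaller tail, capped at 1.
smaller_tail = binom_cdf(min(successes, n - successes), n, p0)
p_value = min(1.0, 2 * smaller_tail)

# Cohen's g: how far the observed proportion is from 0.5.
cohen_g = abs(successes / n - 0.5)

print(f"p-value = {p_value:.4f}, Cohen's g = {cohen_g:.2f}")
```

Only the standard library is used, so the calculation follows the formula step by step; a statistics library would normally do this in a single call.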