Open source

From psychmethods
Latest revision as of 10:45, 16 August 2022

What does open source mean and why should you care about it?

In my opinion, there are four main reasons why open source software should be used whenever possible:

Free as in free beer: This is a no-brainer - you can download the software packages free-of-charge. Companies that produce open source software typically earn their money by selling additional services (mostly to companies or other large organizations) - as a private person you are rarely asked to pay anything for the software. However, donations supporting the development are always welcome.

Free as in freedom of speech: Why should the wheel be re-invented again and again? Let's assume that the implementation of making text bold in a word processor is quite similar in, e.g., Microsoft Word and LibreOffice Writer. The same applies to, e.g., SPSS and SAS, which implement similar statistical procedures or methods, but each in its own way because they don't share their source code. Just think about the waste of time and human effort this causes. But, more importantly, the code quality of proprietary software can't be checked. That means that whatever happens between the moment you click your analysis together and the moment your output appears on the screen can't be checked or verified - you just have to trust the software provider. Even though most of us are not able to assess the quality of source code, there are people who are. As a consequence, open source software is less likely to have major bugs (software errors), and if there are any, they are likely to be fixed more quickly than in proprietary software.

Vendor lock-in: This process is a little bit like becoming addicted to a drug. Switching from one software package to another typically takes a lot of effort, and - efficient (or lazy) as we are - we try to avoid that. So you often stay with what you learned in school or at university, and the lucky software company that provided the software for your education has (often) gained a lifetime customer. This (and not charity or benevolence) is why Microsoft offers free-of-charge licenses of Office 365 to all UiB employees and students. I suppose most software companies listed here have the same aim. The smallest version of Microsoft Office currently costs kr 899,00 every year. You can do the maths yourself. However, it is not only that switching from one software package to another demands resources; these companies developed something even more cunning (or efficient, if you take their perspective), which is called vendor lock-in. It means that there is often no detailed information about the file format in which the software stores its data. If you, e.g., open a Word document (.docx) in another package, it often doesn't look exactly the same as it does in Word, and the same happens again when you send the file back after editing it. This doesn't happen because, e.g., LibreOffice did a bad job when programming the import / export filters, but because parts of the (e.g.) Word format are not documented (and the only thing one can do is to reverse-engineer what went wrong whenever such a mistake is found - remember: you can't look into what, e.g., Microsoft did because their code is not public).
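To make "you can do the maths yourself" concrete, here is a minimal sketch of that calculation. Only the annual price (kr 899,00) comes from the text above; the time spans, the assumption of a constant price, and the function name are illustrative choices, not figures from any vendor.

```python
# Annual price of the smallest Microsoft Office subscription, as quoted above.
ANNUAL_PRICE_KR = 899.00


def cumulative_licence_cost(annual_price: float, years: int) -> float:
    """Total paid after `years` of subscription.

    Assumption (not from the text): the price stays constant, i.e. no
    price increases, discounts, or inflation are taken into account.
    """
    if years < 0:
        raise ValueError("years must not be negative")
    return annual_price * years


# Hypothetical time spans: one year, a decade, a working life of 40 years.
for years in (1, 10, 40):
    cost = cumulative_licence_cost(ANNUAL_PRICE_KR, years)
    print(f"{years:>2} years: kr {cost:,.2f}")
```

Under these assumptions, a single customer who stays for 40 years pays kr 35,960 in total, which puts the free student licenses into perspective.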

Exchanging your data: Most of us don't care any more because we are so used to sharing our data - voluntarily - on places like Google or Facebook. However, text documents in particular might contain private information that you - if you think about it - are not particularly happy to share. Nowadays, Google and Facebook use the information that they gain from the data you share with them for targeted advertisement, which at first glance - less spam - is good for you. However, such information can also be used to tweak elections or referendums, thereby threatening democracy. Furthermore, it is unclear what use can be made of your data in the future, but it is impossible to get your data back once they are out. Already some years ago (published 2015), it was possible to predict your position on the Big-Five-dimensions (the predominant model for describing personality traits) using Facebook-likes - and technology has developed rapidly since then.

Office-suite (and further software for creating documents)

LibreOffice is a free and open-source office suite, comprising programs for word processing (Writer = [equivalent to] Word [in Microsoft Office]), the creation and editing of spreadsheets (Calc = Excel), slideshows (Impress = PowerPoint), diagrams and drawings (Draw), working with databases (Base = Access), and composing mathematical formulae (Math = Formula Editor). Overall, it has similar functionality to the programs included in Microsoft Office and (from version 6.2) a similar user interface, aiming to make the transition from Microsoft Office easy. LibreOffice is available for all major platforms, including Microsoft Windows, macOS, and Linux. In addition, there is a LibreOffice Viewer for Android, as well as an online office suite, LibreOffice Online.
LibreOffice uses the OpenDocument file format (ODF; an international ISO/IEC standard) as its native format to save documents for all of its applications. It supports the file formats of most other major office suites, including Microsoft Office. However, the import from and the export to Microsoft Office don't always work flawlessly. This is - in my opinion - more a problem of the Office file format, because even within Microsoft Office it can't be taken for granted that, e.g., a Word document created on a Mac opens identically on a Windows PC when using this format.
Alternatively, LaTeX can be used to create documents / manuscripts.
There is also an overview of literature / reference management software, covering an open source package - Zotero - and another package that is provided free-of-charge - Mendeley.
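What a documented file format buys you in practice can be sketched in a few lines. An ODF text document (.odt) is a ZIP package whose main text is stored in a file called content.xml, so it can be read with nothing but the Python standard library. The two helper functions below are hypothetical names chosen for this illustration, and the package they build is deliberately stripped down (a real file written by LibreOffice also contains a manifest, styles, and metadata), so treat this as a sketch rather than a full ODF reader.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespaces defined by the OpenDocument standard.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def build_minimal_odt(paragraphs):
    """Build a stripped-down .odt-like package in memory (illustration only).

    Assumption: only `mimetype` and `content.xml` are written; a complete
    ODF package also needs META-INF/manifest.xml, styles.xml, and so on.
    """
    ET.register_namespace("office", OFFICE_NS)
    ET.register_namespace("text", TEXT_NS)
    root = ET.Element(f"{{{OFFICE_NS}}}document-content")
    body = ET.SubElement(root, f"{{{OFFICE_NS}}}body")
    text = ET.SubElement(body, f"{{{OFFICE_NS}}}text")
    for paragraph in paragraphs:
        ET.SubElement(text, f"{{{TEXT_NS}}}p").text = paragraph

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", ET.tostring(root, encoding="utf-8"))
    return buffer.getvalue()


def read_odt_paragraphs(data):
    """Return the text of every paragraph (text:p element) in content.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        root = ET.fromstring(package.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


# Round trip: write two paragraphs, then read them back.
odt_bytes = build_minimal_odt(["Open formats are documented.", "Anyone can read them."])
for line in read_odt_paragraphs(odt_bytes):
    print(line)
```

To try this on a real document, read the bytes of a file saved by LibreOffice Writer and pass them to `read_odt_paragraphs`; headings, lists, and tables use other elements and would need additional handling.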

Software to create and manipulate graphics

Pixel-based

GIMP

Vector-based

Inkscape

Statistical analyses

R is a programming language and free software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. It is the most comprehensive and most advanced software package when it comes to statistical analyses. The disadvantage is that - even though there are graphical user interfaces - it is mainly command-line-based and therefore comes with a certain learning curve. However, you are rewarded for this effort by a wealth of included (and additional) libraries that provide a wide variety of statistical and graphical techniques, including linear and non-linear modelling, classical statistical tests, time-series analysis, classification, clustering, and others. The R community is very active in contributing and extending packages. These contributors are often experienced statisticians, and therefore R and its libraries usually represent state-of-the-art techniques for data analysis, visualization, etc. R and its libraries are provided under open source licenses (meaning that you can download them free-of-charge, can check the code and can adjust it to your needs).

However, while R certainly is the most powerful tool to conduct statistical analyses, it is (generally) command-line-based (i.e., you write commands which then conduct the analyses). This comes with a somewhat steep learning curve, and most people therefore wish to have a graphical user interface. jamovi and JASP provide such graphical user interfaces and at the same time aim for a look-and-feel that is similar to the user interface of SPSS (to ease the transition, since quite a number of academics are trained to use SPSS). Both software packages are very similar to each other and generally provide very similar functionality (the development team of jamovi was part of the JASP team before they split). Advantages of JASP over jamovi are (a) a stronger focus on Bayesian statistics (however, it is possible to run Bayesian analyses in jamovi as well, using an additional module) and (b) context help (a brief introduction to what the statistical procedure does). jamovi has the following advantages over JASP: (a) syntax can be used, i.e., you can take the analyses that you clicked together in the GUI, copy and adjust them so that you can use them directly in R, and vice versa you can run most R code within the Rj editor (an additional module for jamovi); (b) in-program data editing (JASP opens data in an external spreadsheet program like Excel or LibreOffice Calc); (c) jamovi is more recent and therefore based on programming technology that wasn't available when JASP was written, which makes it possible, e.g., to run jamovi in your browser and makes it easier to create modules that add functionality (further analyses) to jamovi.
Both packages can be downloaded free-of-charge: jamovi and JASP.

PSPP is intended as an open source equivalent of and replacement for the proprietary program SPSS (you can download it free-of-charge). However, even though basic functionality is implemented, a number of more advanced methods (notably ANOVA for repeated measurements) are currently missing.

Danielle J. Navarro has written an excellent textbook covering a wide range of statistical methods that can also be downloaded for free: the original book was written for R and was later adapted for jamovi (by David R. Foxcroft) and for JASP (by David R. Foxcroft and Thomas J. Faulkenberry). Danielle also started working on a more comprehensive version using R and tidy data.

All software packages mentioned above are available for the three major operating systems (Windows, macOS, Linux).