Python
Python is a powerful programming language that is actively maintained by its development team.
- Versatile! It can serve as a simple calculator, organize and manipulate data structures, visualize data, run machine learning models, and even process MRI data.
- Older versions of macOS (before 12.3) came pre-installed with Python 2.7, but most packages now require Python 3.
- To take full advantage of Python, we recommend installing libraries.
- Library = a package of Python code that provides ready-made functions
- Some libraries are built upon others: they add functionality on top of already existing functions.
- Python documentation.
- Resources for learning how to use Python for data science.
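As a minimal sketch of the "simple calculator" and "data structures" uses mentioned above, the following uses only the standard library. The participant IDs and reaction times are made-up values for illustration.

```python
import statistics

# Python as a calculator: ordinary arithmetic works directly.
total = (3 + 4) * 2      # 14
ratio = 7 / 2            # true division returns a float: 3.5
quotient = 7 // 2        # floor division returns an int: 3

# Organizing data with a built-in structure (hypothetical values).
reaction_times = {
    "sub-01": [0.42, 0.51, 0.47],
    "sub-02": [0.63, 0.58, 0.61],
}

# Dictionary comprehension: mean reaction time per participant.
mean_rt = {sub: statistics.mean(times) for sub, times in reaction_times.items()}

# Participant with the lowest mean reaction time.
fastest = min(mean_rt, key=mean_rt.get)

print(total, ratio, quotient)
print(fastest)
```

Built-in types such as lists and dictionaries cover small jobs like this; the libraries listed later become worthwhile once the data grows into tables or arrays.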
Installation
To install the latest version of Python, go to Python for Mac, download the macOS 64-bit installer, and follow the prompts.
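After installing, it can help to confirm which interpreter is actually running. The sketch below is one way to do that; `is_python3` is a hypothetical helper name introduced here for illustration, not part of Python itself.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True if the given version tuple is Python 3 or newer."""
    return version_info[0] >= 3


# Print the running interpreter's version string, e.g. "3.x.y".
print(sys.version.split()[0])
print("Python 3 or newer:", is_python3())
```

If this reports a 2.x version, the script was probably launched with an older interpreter instead of the newly installed one.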
Bread and Butter Libraries
Data Organization & Processing
- Pandas: organize, manipulate, and plot single-column data (Series) and tabular data (DataFrames)
- NumPy: short for Numerical Python, perform operations on arrays and matrices
- SciPy: conduct linear algebra, integration, optimization, and statistics
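To make the division of labor concrete, here is a small sketch of NumPy array operations and a pandas table, assuming both libraries are installed. The group labels and scores are invented for illustration.

```python
import numpy as np
import pandas as pd

# NumPy: operations on arrays and matrices.
scores = np.array([[1.0, 2.0], [3.0, 4.0]])
doubled = scores * 2             # element-wise multiplication
product = scores @ np.eye(2)     # matrix product with the 2x2 identity

# pandas: a DataFrame is a table; each column is a Series.
df = pd.DataFrame({
    "group": ["control", "control", "patient", "patient"],
    "score": [10.0, 12.0, 15.0, 17.0],
})

# Mean score per group, returned as a Series indexed by group name.
group_means = df.groupby("group")["score"].mean()

print(doubled)
print(group_means)
```

NumPy handles the raw numerical arrays, while pandas adds labeled rows and columns on top, which is usually more convenient for spreadsheet-like data.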
Data Visualization
- Matplotlib: plotting, making pretty figures (line graphs, scatterplots, histograms)
- Seaborn: extends Matplotlib for making figures to visualize statistical tests and models (correlation matrices, connectivity matrices, distributions)
- Plotly: web-based graphing library for creating and displaying interactive figures; hovering over a data point shows its details.
- Bokeh: for interactive and scalable visualizations in web browsers (aka cool presentations).
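As a rough illustration of the plotting workflow, the sketch below draws a line graph with Matplotlib, assuming it is installed. The data are arbitrary, and the figure is written to an in-memory buffer so the example runs without a display or an output file.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no window is opened
import matplotlib.pyplot as plt

# Arbitrary example data: y = x squared.
x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

fig, ax = plt.subplots()
ax.plot(x, y, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Simple line graph")
ax.legend()

# Render the figure as PNG bytes held in memory.
buf = io.BytesIO()
fig.savefig(buf, format="png")
plt.close(fig)
```

To keep the image, pass a file name to `fig.savefig` instead of the buffer. Seaborn builds on these same figure and axes objects, so the labeling calls carry over.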
Machine Learning/AI
- Scikit-Learn: machine learning and data mining library built on NumPy and SciPy
- Think classification, clustering, regression, dimensionality reduction, model selection.
- TensorFlow: artificial intelligence library for creating large-scale neural networks (machine learning + deep learning)
- Keras: neural network library
- TensorFlow's high-level API for building and training deep neural networks
- Think statistical modeling, but applied to images and text
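For a sense of what the classification use case of Scikit-Learn looks like in practice, here is a minimal sketch, assuming scikit-learn is installed. The choice of the bundled iris dataset, logistic regression, and a 25% test split are illustrative assumptions, not recommendations from this page.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small example dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the samples for testing; fix the seed for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a simple classifier and measure accuracy on the held-out data.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

print(f"Test accuracy: {accuracy:.2f}")
```

Most scikit-learn estimators follow this same `fit` then `score` or `predict` pattern, so swapping in a different model typically changes only the line that constructs `clf`.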