Python



Python is a powerful, general-purpose programming language that is actively maintained by its development team.

  • Versatile! It can serve as a simple calculator, organize and manipulate data structures, visualize data, run machine learning models, and even process MRI data.
  • Older versions of macOS came with Python 2.7 pre-installed (Apple removed it in macOS 12.3), but most packages require Python 3.
  • To take full advantage of Python, install libraries.
    • Library = a package of pre-written Python code that provides ready-made functions
  • Some libraries are built on others: they add functionality on top of existing functions.
  • Python documentation.
  • Resources for learning how to use Python for data science.
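
As a minimal sketch of the "versatile" point above, the snippet below uses only the standard library to do calculator-style arithmetic and to organize a small data structure. The subject IDs and reaction-time values are made up for illustration.

```python
import statistics

# Python as a simple calculator
total = 3 * (4 + 5) / 2

# Organizing data in a dictionary (hypothetical reaction times in seconds)
reaction_times = {
    "sub-01": [0.41, 0.39, 0.45],
    "sub-02": [0.52, 0.48, 0.50],
}

# Compute the mean reaction time for each subject
mean_rt = {subject: statistics.mean(times)
           for subject, times in reaction_times.items()}

print(total)
print(mean_rt)
```

Everything here ships with Python itself; the libraries listed further down build on these same basics.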

Installation

To install the latest version of Python, go to Python for Mac, download the macOS 64-bit installer, and follow the prompts.
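
After installing, it can help to confirm which interpreter actually runs your scripts. The short check below is one way to do that, assuming you save it to a file and run it with the `python3` command; it relies only on the standard `sys` module.

```python
import sys

# Version of the interpreter running this script, e.g. 3 and 12
major, minor = sys.version_info[:2]
print("Running Python {}.{} from {}".format(major, minor, sys.executable))

# Most current packages expect Python 3
if major < 3:
    print("This is Python 2; try running the script with python3 instead.")
```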


Bread and Butter Libraries

Data Organization & Processing

  • Pandas: organize, manipulate, and plot Series (single-column data) and DataFrames (tables)
  • NumPy: short for Numerical Python; performs operations on arrays and matrices
  • SciPy: conduct linear algebra, integration, optimization, and statistics
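
A rough sketch of how the three libraries above fit together, assuming they are already installed (for example with `pip install numpy pandas scipy`). The group labels and values are invented for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

# NumPy: operations on arrays (here, the mean of each column)
scores = np.array([[1.0, 2.0], [3.0, 4.0]])
col_means = scores.mean(axis=0)

# pandas: a DataFrame is a table; each column is a Series
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": [1.0, 2.0, 3.0, 5.0],
})
group_means = df.groupby("group")["value"].mean()

# SciPy: independent-samples t-test comparing the two groups
a = df.loc[df["group"] == "a", "value"]
b = df.loc[df["group"] == "b", "value"]
t_stat, p_value = stats.ttest_ind(a, b)

print(col_means)
print(group_means)
print(t_stat, p_value)
```

Note that pandas stores its columns as NumPy arrays and SciPy accepts both, which is an instance of libraries being built on one another.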

Data Visualization

  • Matplotlib: plotting, making pretty figures (line graphs, scatterplots, histograms)
  • Seaborn: extends Matplotlib for making figures to visualize statistical tests and models (correlation matrices, connectivity matrices, distributions)
  • Plotly: web-based graphing library for creating and displaying interactive figures; hover over data points to show details.
  • Bokeh: for interactive and scalable visualizations in web browsers (aka cool presentations).
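
For a feel of the most basic of these, here is a minimal Matplotlib line plot, assuming Matplotlib is installed. The data and the output filename `example_plot.png` are placeholders, and the `Agg` backend is chosen only so the script also runs on machines without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; writes files instead of opening a window
import matplotlib.pyplot as plt

# Made-up data: x values and their squares
x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

fig, ax = plt.subplots()
ax.plot(x, y, marker="o", label="y = x squared")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Save the figure to the current directory (hypothetical filename)
fig.savefig("example_plot.png")
```

Seaborn, Plotly, and Bokeh follow a similar pattern of building a figure and then showing or saving it, though each has its own functions.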

Machine Learning/AI

  • Scikit-Learn: machine learning library built on NumPy and SciPy
    • Machine learning and data mining. Think classification, clustering, regression, dimensionality reduction, model selection.
  • TensorFlow: artificial intelligence library for creating large-scale neural networks (machine learning + deep learning)
  • Keras: neural network library
    • TensorFlow’s high-level API for building and training deep neural networks
    • Like statistical modeling, but applied to images and text
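
Of the libraries above, Scikit-Learn is probably the quickest to try. The sketch below shows a simple classification workflow, assuming scikit-learn is installed; the iris dataset, logistic regression model, and 25% test split are arbitrary choices for illustration, not a recommendation.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Small example dataset bundled with scikit-learn
X, y = load_iris(return_X_y=True)

# Hold out 25% of the samples to evaluate the model
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a classifier on the training data
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Fraction of held-out samples classified correctly
accuracy = clf.score(X_test, y_test)
print("Test accuracy: {:.2f}".format(accuracy))
```

Most scikit-learn models share this `fit` and `score` pattern, so swapping in a different model usually means changing only the line that creates `clf`.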