Skills in Statistics, Data Science and Machine Learning
         
2018-06-30

Statistics
- Knowledge of Linear Models and Generalised Linear Models
(including logistic regression), both in theory and in
applications
 
- Classical statistical inference (maximum likelihood estimation,
method of moments, minimum-variance unbiased estimators) and
testing (including goodness of fit)
 
- Nonparametric statistics
 
- Bootstrap methods, hidden Markov models
 
- Knowledge of Bayesian Analysis techniques for inference and
testing: Markov Chain Monte Carlo, Approximate Bayesian
Computation, Reversible Jump MCMC
 
- Good knowledge of R for statistical modelling and plotting
 
Data Analysis
- Experience with large datasets, for classification and regression
 
- Descriptive statistics, plotting (with dimensionality reduction)
 
- Data cleaning and formatting
 
- Experience with unstructured data coming directly from embedded
sensors to a microcontroller
 
- Experience with large graph and network data
 
- Experience with live data from APIs
 
- Data analysis with pandas, xarray (Python) and the tidyverse (R)
 
- Basic knowledge of SQL
 
Graph and Network Analysis
- Research project on community detection and graph clustering
(theory and implementation)
 
- Research project on Topological Data Analysis for time-dependent
networks
 
- Random graph models
 
- Estimation in networks (Stein’s method for Normal and Poisson
approximation)
 
- Network Analysis with NetworkX, graph-tool (Python) and igraph (R
and Python)
 
Time Series Analysis
- Experience in analysing inertial sensor data (accelerometer,
gyroscope, magnetometer), both in real time and in post-processing
 
- Use of statistical methods for step detection, gait detection, and
trajectory reconstruction
 
- Kalman filtering, Fourier and wavelet analysis
 
- Machine Learning methods applied to time series (decision trees,
SVMs and Recurrent Neural Networks in particular)
 
- Experience with signal processing functions in NumPy and SciPy
(Python)
 
Machine Learning
- Experience in Dimensionality Reduction (PCA, MDS, Kernel PCA,
Isomap, spectral clustering)
 
- Experience with the most common methods, both theoretical
knowledge and practical experience: random forests, SVMs, neural
networks (including CNNs and RNNs)
 
- Bagging and boosting estimators
 
- Cross-validation
 
- Kernel methods, reproducing kernel Hilbert spaces, collaborative
filtering, variational Bayes, Gaussian processes
 
- Machine Learning libraries: scikit-learn, PyTorch, TensorFlow,
Keras
 
Simulation
- Inversion, Transformation, Rejection, and Importance sampling
 
- Gibbs sampling
 
- Metropolis-Hastings
 
- Reversible jump MCMC
 
- Hidden Markov Models and Sequential Monte Carlo Methods