Statistics
- Knowledge of Linear Models and Generalised Linear Models
(including logistic regression), both in theory and in
applications
- Classical statistical inference (maximum likelihood estimation,
method of moments, minimum-variance unbiased estimators) and
testing (including goodness of fit)
- Nonparametric statistics
- Bootstrap methods, hidden Markov models
- Knowledge of Bayesian analysis techniques for inference and
testing: Markov chain Monte Carlo, approximate Bayesian
computation, reversible jump MCMC
- Good knowledge of R for statistical modelling and plotting
Data Analysis
- Experience with large datasets, for classification and regression
- Descriptive statistics, plotting (with dimensionality reduction)
- Data cleaning and formatting
- Experience with unstructured data coming directly from embedded
sensors to a microcontroller
- Experience with large graph and network data
- Experience with live data from APIs
- Data analysis with Pandas, xarray (Python) and the tidyverse (R)
- Basic knowledge of SQL
Graph and Network Analysis
- Research project on community detection and graph clustering
(theory and implementation)
- Research project on Topological Data Analysis for time-dependent
networks
- Random graph models
- Estimation in networks (Stein’s method for Normal and Poisson
approximation)
- Network Analysis with NetworkX, graph-tool (Python) and igraph (R
and Python)
Time Series Analysis
- Experience analysing inertial sensor data (accelerometer,
gyroscope, magnetometer), both in real time and in post-processing
- Use of statistical methods for step detection, gait detection, and
trajectory reconstruction
- Kalman filtering, Fourier and wavelet analysis
- Machine Learning methods applied to time series (decision trees,
SVMs and Recurrent Neural Networks in particular)
- Experience with signal processing functions in NumPy and SciPy
(Python)
Machine Learning
- Experience in Dimensionality Reduction (PCA, MDS, Kernel PCA,
Isomap, spectral clustering)
- Theoretical knowledge and practical experience of the most common
methods: random forests, SVMs, neural networks (including CNNs and
RNNs)
- Bagging and boosting estimators
- Cross-validation
- Kernel methods, reproducing kernel Hilbert spaces, collaborative
filtering, variational Bayes, Gaussian processes
- Machine Learning libraries: scikit-learn, PyTorch, TensorFlow,
Keras
Simulation
- Inversion, Transformation, Rejection, and Importance sampling
- Gibbs sampling
- Metropolis-Hastings
- Reversible jump MCMC
- Hidden Markov Models and Sequential Monte Carlo Methods