Precision Medicine refers to the tailoring of medical treatment to the individual characteristics of each patient. The idea is to move away from prescribing one therapy to all patients with the same disease and instead to use all available information to treat each patient in a personalized way.
Clinical Data Analysis
The overall aim is to build elements of decision support systems that utilize electronic health records (EHR), genomic information, lab test values and images to suggest treatments and to assist in the design of new trials. An electronic health record is a digital collection of patient health information. Ideally, every instance of patient care is recorded as a time-stamped entry in the EHR, with clinical data in a variety of formats such as clinical text notes, pathology images, genomic data and more. This makes for a complex and growing dataset, and an exciting opportunity to develop novel algorithms for use in the biomedical field. We are developing advanced methods for feature extraction and clustering from this growing body of clinical data. We aim to use these features to develop tools that improve patient care, explore multi-scale phenotype learning and help physicians streamline their work.
Network Inference
The aim in network inference is to find interactions between objects. Applications in biomedical analysis include finding inter-dependencies between genes or groups of genes to infer a gene-to-gene regulation network. This can also be done on the protein level in order to infer protein-protein interaction networks.
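As a rough illustration of the idea, the sketch below builds a simple co-expression network by connecting two genes whenever the absolute Pearson correlation of their expression profiles exceeds a threshold. This is only one basic approach and not necessarily the method used in our work. The gene names, the toy data and the threshold of 0.8 are assumptions made for the example, and a correlation edge indicates association rather than a regulatory direction.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (norm_x * norm_y)


def infer_network(expr, threshold=0.8):
    """Return edges (gene, gene, r) for all pairs with |r| >= threshold.

    `expr` maps a gene name to its expression values across samples.
    """
    genes = sorted(expr)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(expr[g], expr[h])
            if abs(r) >= threshold:
                edges.append((g, h, r))
    return edges


# Hypothetical toy data: geneB follows geneA closely, geneC is independent.
random.seed(0)
base = [random.gauss(0, 1) for _ in range(50)]
expr = {
    "geneA": base,
    "geneB": [2 * v + random.gauss(0, 0.1) for v in base],
    "geneC": [random.gauss(0, 1) for _ in range(50)],
}
edges = infer_network(expr)
for g, h, r in edges:
    print(g, h, round(r, 3))
```

In practice, more refined approaches such as partial correlations are often preferred, because plain correlation cannot separate direct from indirect dependencies.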
Sparse Feature Selection
Sparse feature selection is a state-of-the-art approach in high-dimensional data analysis. Forcing the solution to be sparse is expected to make the model more interpretable. For example, when predicting the outcome of a treatment, one is interested not only in the prediction accuracy of the model but also in selecting a small set of variables, e.g. genes, that are the most significant ones for the prediction.
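One common way to obtain a sparse solution is L1-regularized linear regression (the lasso). The sketch below is a minimal coordinate-descent implementation, given only as an illustration and not as the specific method used in our work. The toy data, the penalty `lam=0.3` and the choice of which two "genes" drive the outcome are assumptions made for the example.

```python
import random


def soft_threshold(value, lam):
    """Shrink value towards zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw because w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature can never be selected
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Hypothetical toy data: the outcome depends only on features 0 and 3.
random.seed(1)
n_samples, n_genes = 100, 10
X = [[random.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]
y = [3.0 * row[0] - 2.0 * row[3] + random.gauss(0, 0.1) for row in X]

weights = lasso(X, y, lam=0.3)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("selected feature indices:", selected)
```

The penalty `lam` controls the trade-off: larger values set more weights exactly to zero, at the cost of shrinking the remaining weights further.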
Clustering
Clustering is the process of discovering groups of related objects. A typical biological application is subtype detection for tumor samples, where clusters containing mutually similar samples are associated with tumor subtypes. In general, a clustering solution may be considered meaningful if patterns within each group are more similar to each other than to patterns in other groups.
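To make this concrete, the sketch below runs plain k-means on toy data that mimics two tumor subtypes. K-means is only one of many clustering algorithms and is used here purely as an illustration. The data, the number of clusters and the deterministic farthest-point initialization are assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def kmeans(points, k, n_iter=50):
    """Return (labels, centroids) after at most n_iter k-means iterations."""
    # Farthest-point initialization: start from the first point, then keep
    # adding the point that is farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids


# Hypothetical toy data: 20 samples per "subtype", 4 measurements each.
random.seed(2)
subtype_a = [[random.gauss(0, 0.5) for _ in range(4)] for _ in range(20)]
subtype_b = [[random.gauss(5, 0.5) for _ in range(4)] for _ in range(20)]
samples = subtype_a + subtype_b

labels, centroids = kmeans(samples, k=2)
print(labels)
```

On real expression data the number of clusters is usually unknown and the groups are far less well separated, so the result should be checked against the similarity criterion described above.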
Multi-Task Learning
In biomedical applications, high-dimensional data is often available but the sample size is small. This problem arises, for instance, in gene expression studies, where the expression values of tens of thousands of genes are measured for only a few patients. In multi-task learning, the aim is to learn on many related data sets simultaneously and thereby increase the predictive power compared to learning on each of these data sets separately.