A vast amount of data on human activity is captured by sensors, cameras, computers, and smartphones. This data is typically high-dimensional, sequential, complex, heterogeneous, and multimodal (for example, comprising both images and text), yet often comes with a small sample size. New techniques for predicting patterns, and thus for extracting meaningful and useful information from this “data deluge”, are emerging and offer a huge opportunity for societal benefit. One such tool is Rough Path Theory, a branch of stochastic analysis, which can be used to describe complex behaviour concisely.

In the rapidly moving field of data science, Rough Path Theory can add significant value to existing methods. Real-world streamed data comes in a rich variety of forms and may, for example, be recorded at irregular times and in differing amounts. In contrast to conventional methods for describing sequential data (where order matters), Rough Path Theory can efficiently describe such data in terms of the sequence of events alone, without introducing a parameterisation, which results in a massive yet controlled dimension reduction. These simplified “top-down” descriptions of the data offer great potential for facilitating the use of data science in understanding social and human data.
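To make the idea of a parameterisation-free description concrete, here is a minimal sketch using the path signature, a standard object in Rough Path Theory that the text above does not name explicitly. The function name `signature_level2` and the toy paths are illustrative assumptions, not part of any particular library. The sketch computes only the first two signature levels of a piecewise-linear path and shows that sampling the same path more densely, or with repeated points, leaves the result unchanged, while changing the order of events does not.

```python
import numpy as np

def signature_level2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    `path` has shape (n_points, d). Illustrative helper, not a library API.
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    level1 = np.zeros(d)        # total increment of the path
    level2 = np.zeros((d, d))   # iterated integrals of order two
    for step in np.diff(path, axis=0):
        # Chen's identity for appending one straight-line segment
        level2 += np.outer(level1, step) + 0.5 * np.outer(step, step)
        level1 += step
    return level1, level2

# The same route (right, then up) recorded at two different sampling rates.
coarse = [[0, 0], [1, 0], [1, 1]]
dense = [[0, 0], [0.5, 0], [0.5, 0], [1, 0], [1, 0.25], [1, 1]]

# The same endpoints, but the events happen in the opposite order (up, then right).
swapped = [[0, 0], [0, 1], [1, 1]]

s1_coarse, s2_coarse = signature_level2(coarse)
s1_dense, s2_dense = signature_level2(dense)
s1_swapped, s2_swapped = signature_level2(swapped)

same_under_resampling = np.allclose(s2_coarse, s2_dense)
differs_when_reordered = not np.allclose(s2_coarse, s2_swapped)
print(same_under_resampling, differs_when_reordered)
```

Level 1 records only the overall displacement, so it cannot tell `coarse` and `swapped` apart; level 2 captures the order in which the coordinates moved. Truncating the signature at a fixed level yields a fixed-size summary regardless of how many samples were recorded.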