Many problems in large-scale social networks, data mining, and machine learning can be solved efficiently by methods drawn from optimization (including convex optimization) or by methods based on stochastic search. This group's research focus is to analyze the convergence behavior of algorithms for machine learning problems and to compare the theoretical predictions with the behavior observed in computational experiments. Current work is devoted to fast, scalable graph algorithms based on random walks.
This research aims to advance the understanding of hidden representations in machine learning models, with a specific focus on Transformer models. The project uses MSI's computational resources to train, fine-tune, and perform inference on cutting-edge open-source Large Language Models (LLMs) such as Meta's Llama. The group's approach employs a range of numerical algorithms, from matrix factorizations to clustering and projection techniques. This requires the large GPU memory (VRAM) available at MSI, which is crucial for handling the computational demands of high-dimensional data analysis. Additionally, the substantial VRAM, RAM, and storage capacities at MSI are vital for efficient batch processing and for storing extensive datasets such as ImageNet alongside large model weights. This research contributes to the field of AI.
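As an illustration of the kind of projection technique mentioned above, the following minimal sketch projects a set of hidden-state vectors onto their leading principal directions using a singular value decomposition. It is not the group's code: the function name `pca_project`, the array sizes, and the synthetic data standing in for real Transformer hidden states are all assumptions made for the example.

```python
import numpy as np

def pca_project(hidden, k):
    """Project the rows of `hidden` onto their top-k principal directions."""
    # Center the data so the SVD captures variance rather than the mean offset.
    centered = hidden - hidden.mean(axis=0, keepdims=True)
    # Thin SVD: centered = U @ diag(S) @ Vt; rows of Vt are principal directions.
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ Vt[:k].T
    # Fraction of total variance captured by the first k directions.
    explained = (S[:k] ** 2).sum() / (S ** 2).sum()
    return coords, explained

rng = np.random.default_rng(0)
# Synthetic stand-in for hidden states (an assumption): 200 vectors of
# dimension 64 that lie close to a 3-dimensional subspace.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 64))
hidden = latent @ mixing + 0.01 * rng.normal(size=(200, 64))

coords, explained = pca_project(hidden, k=3)
print(coords.shape, explained)
```

With real model activations, `hidden` would instead hold one row per token or per input, and the low-dimensional `coords` could then be passed to a clustering or plotting routine.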
A particular project examines the sparse latent representation proposed by the research group at Anthropic. Usually a latent representation lies in a linear space of dimension lower than that of the raw data, but here the proposal is to use a space of larger dimension while forcing the representation to be sparse. The observation is that many of the individual dimensions will each represent a semantic feature hidden in the data. This can be used to partially explain the output of a neural network and, by varying individual features, to produce sequences of morphed outputs (e.g., images) interpolating between two or more final images. The user can use this to explore the raw data graphically.
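The idea of an overcomplete but sparse latent space, and of varying one latent feature to morph the output, can be sketched as follows. The text does not specify an architecture, so this example assumes one common formulation: a single-layer autoencoder with a ReLU encoder, a linear decoder, and an L1 penalty on the latent activations. All names, sizes, and the random untrained weights are hypothetical; with untrained weights the activations are not yet sparse, and the sketch only shows the structure of the computation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_latent = 16, 64  # latent space is larger than the input space

# Hypothetical, untrained parameters; a real model would learn these.
W_enc = rng.normal(scale=0.1, size=(d_latent, d_in))
b_enc = np.zeros(d_latent)
W_dec = rng.normal(scale=0.1, size=(d_in, d_latent))
b_dec = np.zeros(d_in)

def encode(x):
    # ReLU keeps latent activations non-negative; training with the L1
    # penalty below is what would push most of them to exactly zero.
    return np.maximum(0.0, W_enc @ x + b_enc)

def decode(f):
    # Linear decoder: the output is a weighted sum of the columns of W_dec.
    return W_dec @ f + b_dec

def loss(x, l1_weight=1e-3):
    # Reconstruction error plus an L1 sparsity penalty on the latent code.
    f = encode(x)
    return np.sum((x - decode(f)) ** 2) + l1_weight * np.sum(np.abs(f))

def sweep_feature(x, j, values):
    # Decode the same input repeatedly while overriding latent feature j,
    # giving a sequence of "morphed" outputs.
    f = encode(x)
    outputs = []
    for v in values:
        g = f.copy()
        g[j] = v
        outputs.append(decode(g))
    return np.stack(outputs)

x = rng.normal(size=d_in)
morphs = sweep_feature(x, j=5, values=np.linspace(0.0, 2.0, 5))
print(morphs.shape, loss(x))
```

Because the decoder is linear, sweeping a single feature moves the output along a straight line in the direction of the corresponding decoder column, which is what makes each feature's effect easy to inspect in isolation.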