Semantic Search with Chat Interface for Swedish Legislation

In this post, I will present a project that implements semantic search with a chat interface for Swedish legislation and regulations. The project uses an OpenAI embedding model and a chat completion model to provide accurate, contextual responses to user queries. It combines embeddings with a vector database to return efficient and accurate search results with references to your own data. In the following sections, I will go through the different components and their significance in the project....

May 1, 2023 · 6 min · 1269 words · Johannes Skog

Training Large (7B+) Language Models on Chat Data: Using DeepSpeed and LoRA for Efficient Training

In this post, I will go through the process of training a large language model on chat data, specifically using the LLaMA-7B model. The fine-tuned model has been shown to perform on par with or better than most Hugging Face variants when trained on cleaned Alpaca data. I will go into the benefits of using DeepSpeed for training and how LoRA (Low-Rank Adaptation) can be combined with DeepSpeed to make supervised fine-tuning of large models very efficient....

April 25, 2023 · 9 min · 1755 words · Johannes Skog

Training & Deploying LLMs: A Step-by-Step Guide

In this post, I will go through all the necessary steps to set up and train a state-of-the-art LLM for sentiment analysis on Twitter data (the steps are almost the same for many other NLP applications). I will cover the entire pipeline, from creating the training dataset to deploying the model using TorchServe and Kubernetes on Azure. See GitHub for the code. Setting up the Training Pipeline. Setting Up the Environment. First, we need to set up our environment to run the training and deployment steps....

April 18, 2023 · 7 min · 1358 words · Johannes Skog

Reacher - reach out to a remote..

Computationally heavy processing often involves at least two machines: one for local development and test runs (a laptop, say), and one for running the full processing. Setting up and maintaining the correct environment across the two machines can be complex and time-consuming. Switching between local and remote development is not easy either: maybe you want to tweak one line of code on your local machine and re-run the full processing on the remote machine?...

March 19, 2023 · 6 min · 1074 words · Johannes Skog

Diffusion model

In this post I will write about diffusion models, which have seen an upswing with recent successes (e.g. DALL-E). Diffusion models fall in the class of generative models. Compared to GANs, they are much easier to train and do not require the adversarial setup that can be tricky to get right. There are, however, some drawbacks: while a GAN produces the final result in a single inference pass, a diffusion model must traverse a chain of multiple (in some cases $> 4000$) inference steps to reach the final result....

November 15, 2022 · 8 min · 1534 words · Johannes Skog

Autoencoders & Variational Autoencoders

In this post I will write about Autoencoders and Variational Autoencoders. The former compress data into a dense representation that can be used in various applications; the latter extend Autoencoders with additional properties that make it possible to generate data that appears to follow the original data distribution. Both fall into the class of Latent Variable Models. Latent Variable Models. Latent Variable Models are a class of models that map an observable variable $X$ to a latent (hidden/unknown) variable $Z$....

November 6, 2022 · 5 min · 929 words · Johannes Skog