In recent years, the capabilities of language models have grown exponentially, transforming the way we interact with technology. From generating human-like text to providing intelligent responses, these models have become integral to various applications. However, accessing these models often requires an internet connection, limiting their usability in offline scenarios. But what if you could harness the power of language models directly on your Android phone, without relying on the internet? In this guide, we’ll explore how to install and run Large Language Models (LLMs) locally on Android devices.
Understanding LLMs
Large Language Models, such as GPT (Generative Pre-trained Transformer) models, are deep learning models trained on vast amounts of text data. These models can understand and generate human-like text, making them incredibly versatile for tasks like language translation, text summarization, and conversation generation.
Benefits of Running LLMs Locally
Running LLMs locally on your Android device offers several advantages:
- Offline Accessibility: By installing LLMs on your phone, you can access their capabilities even without an internet connection, making them ideal for scenarios where connectivity is limited or unavailable.
- Privacy: Running LLMs locally enhances privacy by processing data directly on your device, reducing the need to send sensitive information over the internet to remote servers.
- Faster Response Times: Local execution of LLMs can result in faster response times compared to relying on remote servers, especially in situations where latency is a concern.
Install the MLC Chat App
The MLC Chat app is not available on the Google Play Store. Hence, you have to sideload the app on your device. Follow the steps below.
Note: The MLC Chat app is still a demo and is optimized specifically for Galaxy S23 devices powered by the Snapdragon 8 Gen 2 chip.
1. Download the MLC Chat APK (around 148 MB) for free from the official website and install it on your device.
2. After installing the MLC Chat app, download the LLM model you want to use for AI chatting. Make sure you are connected to the internet.
3. I recommend downloading the Phi-2 model first since it should run fine on most devices.
4. Tap on the Download button next to the model name to download it to your device. The download size and speed may vary depending on your internet connection.
5. Once the model is downloaded, you can disconnect from the internet and tap the Chat icon to start using the model you just downloaded.
6. You can now start chatting with the AI model locally. Note that both input and output are text-only.
Model List:
- Llama3-8B-Instruct-q3f16-MLC
- gemma-2b-q4f16_1 (not working)
- phi-2-q4f16_1 (recommended)
- Llama-2-7b-chat-hf-q4f16_1
- Mistral-7B-Instruct-v0.2-q4f16
- RedPajama-INCITE-Chat-3B-v1-q4f16_1
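The quantization suffix in each model name hints at how the weights are stored: for example, q4f16 roughly means 4-bit quantized weights with float16 compute, and q3f16 means 3-bit weights. As a rough sketch (the exact MLC packing adds per-group scales and metadata that this simple 10% overhead guess does not model precisely), you can estimate how much storage a model's weights need and pick one that fits your phone:

```python
# Rough weight-size estimate for quantized models.
# Assumptions (not from the MLC docs): q4 ~ 4 bits per weight, q3 ~ 3 bits,
# plus ~10% overhead for quantization scales and metadata.

def estimate_weight_gb(n_params: float, bits_per_weight: int, overhead: float = 0.10) -> float:
    """Return an approximate size in GB for a model's quantized weights."""
    size_bytes = n_params * bits_per_weight / 8
    return size_bytes * (1 + overhead) / 1e9

# Parameter counts inferred from the model names in the list above.
models = {
    "phi-2-q4f16_1": (2.7e9, 4),            # Phi-2 has ~2.7B parameters
    "Llama-2-7b-chat-hf-q4f16_1": (7e9, 4),
    "Llama3-8B-Instruct-q3f16-MLC": (8e9, 3),
}

for name, (params, bits) in models.items():
    print(f"{name}: ~{estimate_weight_gb(params, bits):.1f} GB")
```

This is one reason Phi-2 is the safest first download: at roughly 1.5 GB of weights it leaves far more headroom on a typical phone than the 7B- and 8B-parameter models.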
Using the MLC Chat App
The MLC Chat app allows you to store LLMs on your device. At the time of writing, the MLC Chat app cannot leverage the NPU on your smartphone chip; instead, it uses the CPU to run these LLMs locally.
Moreover, the app is optimized for Snapdragon chipsets only, particularly the Snapdragon 8 Gen 2 found in the Galaxy S23. So if you have a MediaTek device, as I do, the number of tokens per second might be low.
For perspective, I was getting somewhere around 1.5 to 2 tokens per second for prefill and about 4 tokens per second for decoding on my POCO X6 Pro, which is powered by the MediaTek Dimensity 8300 Ultra chip. For the uninitiated, a higher tokens-per-second rate means faster, smoother responses.
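To see what those rates mean in practice, here is a back-of-the-envelope sketch (using my measured numbers above as assumed defaults; your device will differ): the prompt is processed at the prefill rate, then each reply token is generated at the decode rate.

```python
def estimated_response_seconds(prompt_tokens: int, output_tokens: int,
                               prefill_tps: float = 2.0, decode_tps: float = 4.0) -> float:
    """Approximate wall-clock time for one reply: the prompt is consumed
    at the prefill rate, then output tokens stream at the decode rate."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps

# e.g. a 50-token prompt with a 100-token answer
t = estimated_response_seconds(50, 100)
print(f"~{t:.0f} seconds")  # 50/2 + 100/4 = 25 + 25 = 50 s
```

In other words, even a modest question can take close to a minute on a mid-range chip, which is why the per-second token counts matter so much here.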
Copy Text
Once you’ve got your response, you can tap and hold the text to select and copy it or continue the conversation with more prompts.
Reset the Chat
If you’d like to end a chat or conversation and start a new one, tap on the reset icon at the top right corner of the interface.
Conclusion
By following these steps, you can harness the power of Large Language Models directly on your Android phone, enabling offline access to advanced language processing capabilities. Whether you’re building a custom application or simply experimenting with LLMs on your device, running these models locally opens up a world of possibilities for enhancing productivity, privacy, and user experience. So why wait? Dive in and unlock the full potential of language models on your Android device today!