LM Studio tutorial: a first-run walkthrough from install to first prompt
Six concrete steps that take you from a fresh machine to a working local model session — with server mode activated and verified.
Practical recap
This LM Studio tutorial covers the full first session: platform installer, model browser, download, load, chat, and server mode. Each step lists an estimated time and a concrete outcome so you know when to move on.
Before you begin
Two things to confirm before running this LM Studio tutorial: your machine has at least 8 GB of RAM available, and you have a reliable internet connection for the model download in step 3.
This tutorial targets a complete first session with LM Studio. It assumes no prior experience with local inference tools, GGUF models, or command-line API calls. The only prerequisite is a machine running Windows 10 or 11, macOS 13 or later, or a mainstream Linux distribution. If you are on Linux, you will need to make one file executable before launching — step 1 notes exactly how.
Hardware-wise, 8 GB of RAM is the practical floor for the 7B model used in steps 3 and 4. A GPU is not required: LM Studio falls back to CPU inference automatically, which is slower but fully functional. If your machine has an NVIDIA, AMD, or Apple Silicon GPU, LM Studio will detect it and surface an offload option during model loading.
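If you are not sure how much memory your machine has, a quick terminal check settles it before committing to the download. These are standard operating-system utilities, not LM Studio commands:

# Linux: total and available memory
free -h

# macOS: total physical memory, reported in bytes
sysctl hw.memsize

# Windows (Command Prompt): total physical memory
systeminfo | findstr /C:"Total Physical Memory"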
Six-step tutorial: from install to server mode
Steps 1 through 4 cover setup and model loading; steps 5 and 6 cover interactive chat and the local API — the two things most people actually want to do.
| Step | Estimated time | Outcome |
|---|---|---|
| 1. Download and install LM Studio | 3–5 min | Application launches and shows the home screen |
| 2. Open the model browser | 1 min | Discover tab shows a grid of available models |
| 3. Download a model | 5–20 min (network-dependent) | Progress bar completes; model file appears on disk |
| 4. Load the model and open chat | 1–3 min | Model loaded indicator appears; chat input is active |
| 5. Write a system prompt and first message | 2–5 min | Model returns a coherent response in the chat window |
| 6. Enable server mode and verify with curl | 2–3 min | curl returns a JSON completion response from localhost |
Step 1 — Download and install LM Studio
Navigate to the LM Studio download page and pick the installer that matches your platform. On Windows, run the .exe and follow the standard wizard. On macOS, open the .dmg and drag the application to your Applications folder. On Linux, save the AppImage to a convenient location, then open a terminal and run chmod +x LMStudio-*.AppImage before double-clicking to launch.
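For Linux specifically, the whole install step fits in three shell lines. The filename pattern below is a placeholder; match it to the AppImage you actually downloaded:

cd ~/Downloads                      # or wherever you saved the AppImage
chmod +x LMStudio-*.AppImage        # mark it executable (needed once)
./LMStudio-*.AppImage               # launch LM Studio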
When the application opens, you will see a home screen with a sidebar containing five icons: Discover, My Models, Chat, Server, and Settings. Take a moment to notice the status bar at the bottom of the window — it shows whether a model is loaded, the current server state, and basic hardware information.
Step 2 — Open the model browser
Click the Discover icon (the first item in the left sidebar, which looks like a compass or search symbol). The browser loads a grid of cards, each representing a model family. Each card shows the model name, parameter count, and a hardware-fit badge — green means the model is likely to run well on your hardware, yellow means it is a stretch, and red means it exceeds detected capacity.
Use the search bar at the top to filter by name. For this tutorial, type llama-3-8b-instruct or mistral-7b-instruct. Either will work. The browser will return a list of quantized variants for your chosen model, sorted by file size.
Step 3 — Download a model
From the variant list, select the row labeled Q4_K_M. This quantization level gives a good balance between response quality and memory footprint: a 7B Q4_K_M model sits around 4.1–4.4 GB on disk and needs roughly the same amount in RAM. Click the Download button next to that row. A progress bar appears at the bottom of the screen. This step takes roughly five minutes on a fast connection and up to twenty on a slower one. The file is saved to a local models directory; you can change this path in Settings if needed.
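To confirm the file actually landed on disk, you can list it from a terminal. The default directory shown below is an assumption that varies by LM Studio version and platform; the authoritative path is whatever your Settings screen shows:

# macOS / Linux (default models directory assumed; check Settings if yours differs)
find ~/.lmstudio/models -name "*.gguf"

# Windows (Command Prompt), same assumption about the default path
dir /s /b "%USERPROFILE%\.lmstudio\models\*.gguf"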
Step 4 — Load the model and open chat
Once the download finishes, click the model's name in the My Models view (second sidebar icon) or simply click Load from the Discover card. A load dialog appears with an optional GPU offload slider. If LM Studio detected a compatible GPU, drag the slider to the right to offload as many layers as your VRAM allows — more layers on the GPU means faster token generation. Click Load and wait for the status bar to change from "No model loaded" to the model name with a green indicator.
Now click the Chat icon (third sidebar item). A session window opens with two input areas: the system prompt at the top and the user message field at the bottom.
Step 5 — Write a system prompt and send your first message
In the system prompt field, type something like: You are a helpful assistant. Answer concisely and accurately. This sets the behavior for the session. In the user message field below, type a question — for example: What are the main differences between GGUF and GGML model formats? Press Enter or click the Send button.
The model begins generating tokens almost immediately. You will see the response appear word by word in the chat transcript. For a Q4_K_M 7B model on a mid-range CPU, expect roughly 5–15 tokens per second; on a GPU, that number rises to 40–80 tokens per second or higher depending on the hardware. When the response finishes, the input field clears and is ready for your next message. This is a successful first chat session with LM Studio.
Step 6 — Enable server mode and verify with curl
Click the Server icon (fourth sidebar item). You will see a toggle labeled Start server and a port field showing 1234 by default. Click Start server. The status in the panel turns green and shows the endpoint address: http://localhost:1234/v1.
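Before sending a chat request, it is worth checking that the endpoint is actually listening. Open a terminal on the same machine and list the models the server exposes; the /v1/models route follows the OpenAI layout, though the exact response shape may vary slightly between LM Studio versions:

curl http://localhost:1234/v1/models

The response should be a small JSON object whose data array includes the model you loaded in step 4.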
Then, in the same terminal, run the chat completion request below:
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "local-model",
        "messages": [{"role":"user","content":"Say hello in one sentence."}]
      }'
The terminal should print a JSON object with a choices array and a message.content field containing the model's response. That is the LM Studio server responding to an OpenAI-compatible API call. Any tool or application that already speaks the OpenAI Chat Completions schema — Python SDKs, JavaScript libraries, code editors with AI extensions — can now point its base URL at http://localhost:1234/v1 and work without further modification.
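If you only want the generated text rather than the full JSON object, piping the response through jq extracts the field mentioned above. This assumes jq is installed on your system; it is not part of LM Studio:

curl -s http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "local-model",
        "messages": [{"role":"user","content":"Say hello in one sentence."}]
      }' | jq -r '.choices[0].message.content'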
What to explore after this tutorial
With server mode verified, the hardest part is done. From here, the workflow expands into presets, multi-turn sessions, and connecting external tools.
With a model loaded and server mode running, three directions are worth exploring. First, try chat presets: in the Chat settings panel you can save combinations of system prompt, temperature, top-p, and stop tokens as named presets, then switch between them without resetting the session. Second, experiment with quantization levels — load the same model family at Q5_K_M or Q8_0 and compare response quality versus generation speed for your specific use case. Third, wire up an external client: if you use a code editor with an AI extension or a notebook environment, point its configuration at http://localhost:1234/v1 and your local model becomes the backend.
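The sampling settings a chat preset stores (temperature, top-p, stop tokens) can also be passed per request over the local API, which matters once an external client is the one doing the talking. The field names below follow the OpenAI Chat Completions schema; the exact set of supported parameters can vary by LM Studio version, so treat this as a sketch:

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "local-model",
        "messages": [
          {"role":"system","content":"You are a helpful assistant. Answer concisely."},
          {"role":"user","content":"Explain GGUF quantization in two sentences."}
        ],
        "temperature": 0.7,
        "top_p": 0.9,
        "max_tokens": 200,
        "stop": ["\n\n"]
      }'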
For deeper background on how local inference works, the NIST AI resource hub covers evaluation frameworks that are relevant when you start assessing model quality systematically. The Stanford Human-Centered AI group publishes accessible research on language model behavior that can sharpen your intuitions about prompt design.
The documentation index maps every topic on this site. The troubleshooting page covers what to do if a step in this tutorial did not behave as described. The vs-Ollama comparison is worth reading once you have a feel for LM Studio's workflow.
Frequently asked questions
Five questions from readers who are working through this LM Studio tutorial for the first time.
How long does this LM Studio tutorial take from start to finish?
The six steps typically take 20 to 35 minutes the first time through, depending on your internet speed for the model download in step 3 and how much time you spend exploring chat settings. Steps 1 through 3 usually take under 10 minutes combined on a fast connection. Step 6, the server mode verification, adds fewer than 5 minutes once you have a terminal open.
Which model should I download for a first session?
For a first session, a Q4_K_M quantization of a 7B-parameter instruct model is the safest choice. It loads on 8 GB of RAM, runs at a readable speed on CPU-only hardware, and produces coherent answers on standard tasks. Llama 3 8B Instruct and Mistral 7B Instruct are both reliable picks available in the in-app library.
Do I need a GPU to follow this tutorial?
No. The tutorial works on CPU-only hardware, though inference will be slower, roughly 5 to 10 tokens per second on a modern CPU. If you have an NVIDIA, AMD, or Apple Silicon GPU, LM Studio detects it automatically and the layer-offload slider in step 4 will speed things up considerably.
Does the tutorial work the same way on Linux?
Yes. The steps are identical across platforms. Linux users need to make the AppImage executable with chmod +x LMStudio-*.AppImage before launching; step 1 notes this. After that, the LM Studio UI behaves the same as on Windows and macOS, and server mode verification with curl works without changes.
What should I read after finishing this tutorial?
Good next steps are the server mode page for wiring LM Studio to external clients, the API page for making programmatic calls, and the performance page for improving inference speed. The vs-Ollama comparison is useful once you have a feel for LM Studio and want to understand where it sits relative to other local inference tools.