
How Kotlin Coroutines Work Internally in Android

Anand Gaur

Mobile Tech Lead · 22 Jun 2026

A from-scratch guide that goes from “what is a coroutine” all the way down to the compiler-generated state machine that actually makes the magic happen.

If you’ve written Android apps in the last few years, you’ve almost certainly written something like this:

viewModelScope.launch {
    val user = repository.getUser()      // network call
    val posts = repository.getPosts(user.id)  // another network call
    _uiState.value = HomeState.Success(user, posts)
}

It looks like ordinary, top-to-bottom code. No callbacks, no nested Runnables, no "callback hell." And yet, somehow, those two network calls don't block the main thread, your UI stays smooth, and the result lands exactly where you expect it.

That feels like magic. It isn’t. There is a very precise, very clever mechanism running underneath, and once you see it, coroutines stop being a black box. This article walks you down that ladder one rung at a time until you can picture exactly what the Kotlin compiler does to your suspend functions.

We’ll go in this order:

1. The problem coroutines actually solve
2. What a coroutine really is (hint: not a thread)
3. The heart of it all: suspend functions
4. Continuation-Passing Style: the idea behind everything
5. The state machine: what the compiler really generates
6. The Continuation interface and how resume works
7. Dispatchers: who runs the code, and where
8. Structured concurrency: scopes, jobs, and why your coroutines don’t leak
9. A complete real-world Android example
10. Common mistakes that suddenly make sense

Let’s begin.

1. The Problem Coroutines Solve

Android has one golden rule: never block the main thread. The main thread draws your UI roughly 60 times per second (more on high-refresh-rate displays). If you make it wait for a network response, a database query, or a file read, the frames stop, and the user sees jank or the dreaded “Application Not Responding” dialog.

So we have to do slow work somewhere else. For years, the tools for this were painful.

Callbacks were the first answer:

api.getUser(object : Callback {
    override fun onSuccess(user: User) {
        api.getPosts(user.id, object : Callback {
            override fun onSuccess(posts: List<Post>) {
                runOnUiThread { updateUi(user, posts) }
            }
            override fun onError(e: Exception) { /* ... */ }
        })
    }
    override fun onError(e: Exception) { /* ... */ }
})

This is “callback hell.” Two simple sequential operations turned into a deeply nested pyramid, with error handling scattered everywhere. Add a third call and it gets worse.

RxJava improved this with chains and operators, but brought a steep learning curve and its own vocabulary of Observable, Single, flatMap, subscribeOn, observeOn.

Coroutines solve the same problem with a radically simpler promise: write asynchronous code that reads like synchronous code. The two network calls from the intro look sequential because, to your eyes, they are sequential, but no thread is ever blocked while waiting.

The obvious question is: how? How can code “pause” at a network call and “resume” later without holding a thread hostage the entire time? That is the whole story of this article.

2. What a Coroutine Really Is (Not a Thread)

The single most important mental shift: a coroutine is not a thread.

A thread is an expensive, OS-managed resource. On Android you can realistically have a few dozen before memory and context-switching costs hurt you. Each thread reserves stack space (often around 1 MB) whether it’s doing work or just sitting idle.

A coroutine, by contrast, is essentially a piece of work that can suspend and resume. It’s a lightweight object that runs on a thread, but isn’t tied to one. You can launch hundreds of thousands of coroutines on a single Android device because each one is, in memory terms, almost free.

Here’s the analogy I like. Think of a thread as a chef in a kitchen, and coroutines as recipes the chef is cooking. A traditional blocking call is like a chef who starts boiling pasta and then stands there staring at the pot until it’s done — useless, blocked, unavailable. A suspending coroutine is the chef who puts the pasta on, sets a mental note (“come back when the timer rings”), and immediately starts chopping vegetables for another dish. One chef, many recipes in flight, nobody standing idle.

When a coroutine “suspends” at a network call, it doesn’t block the chef (the thread). It steps aside, frees the thread to do other work, and the runtime makes a note to resume it once the network result is ready. Suspension frees the thread; it does not freeze it.
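To make "almost free" concrete, here is a minimal sketch that launches 100,000 coroutines on a single thread. It assumes the kotlinx.coroutines library is on the classpath; runMany is a hypothetical helper name, not something from a library.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Launches `count` coroutines that each suspend briefly,
// then returns how many of them ran to completion.
fun runMany(count: Int): Int = runBlocking {
    val finished = AtomicInteger(0)
    val jobs = List(count) {
        launch {
            delay(100)                  // suspends; no thread is held while waiting
            finished.incrementAndGet()
        }
    }
    jobs.joinAll()
    finished.get()
}

fun main() {
    println(runMany(100_000))   // prints 100000
}
```

Trying the same thing with 100,000 real threads would most likely exhaust memory long before the loop finished.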

That’s the behavior. Now let’s open the engine.

3. The Heart of It All: `suspend` Functions

Everything special about coroutines lives in one keyword: suspend.

suspend fun getUser(): User {
    delay(1000)        // suspends, does NOT block
    return User("Anand")
}

A suspend function is a function that can pause its execution and resume later, without blocking the thread it's running on.

Two rules follow immediately, and they confuse almost everyone at first:

A suspend function can only be called from another suspend function, or from inside a coroutine builder like launch or async. You cannot call it from regular main() or an onClick directly.
A suspend function does not run on a background thread by itself. suspend is about suspension, not about threading. Whether it runs on the main thread or a background thread is decided by the dispatcher (we'll get there in section 7). This is the most common misconception in the entire topic.
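The second rule is easy to check for yourself. The sketch below uses only the Kotlin standard library (no dispatcher anywhere) to start a coroutine and record which thread a suspend function sees. whereAmI and threadSeenBySuspendFun are hypothetical names chosen for this illustration.

```kotlin
import kotlin.coroutines.*

// A suspend function that never actually suspends; it just reports its thread.
suspend fun whereAmI(): String = Thread.currentThread().name

// Starts a coroutine with the bare stdlib API (empty context, so no dispatcher)
// and returns the thread name observed inside the suspend function.
fun threadSeenBySuspendFun(): String {
    var seen = ""
    val body: suspend () -> String = { whereAmI() }
    // A regular function cannot call whereAmI() directly (rule 1),
    // so we go through a coroutine entry point instead.
    body.startCoroutine(Continuation(EmptyCoroutineContext) { result ->
        seen = result.getOrThrow()
    })
    return seen
}

fun main() {
    val caller = Thread.currentThread().name
    println(threadSeenBySuspendFun() == caller)   // prints true
}
```

The suspend function ran on the caller's thread. Nothing moved to the background, because nothing asked for that.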

So what does suspend actually do to the function? This is the crux. The keyword is a signal to the Kotlin compiler: "Transform this function so it can pause and resume." And the way the compiler does that is genuinely elegant. To understand it, we need one foundational idea.

4. Continuation-Passing Style — The Idea Behind Everything

Here is the central trick. When you write:

suspend fun getUser(): User { ... }

the compiler does not compile a function that returns User. Under the hood, it rewrites the signature to something like this (simplified Kotlin pseudocode):

fun getUser(continuation: Continuation<User>): Any?

Two things changed:

An extra hidden parameter appeared: a Continuation.
The return type became Any?.

This rewriting is called Continuation-Passing Style (CPS), and it’s the foundation of the entire coroutines machine.

So what is a Continuation? Think of it as a callback that represents "the rest of the program after this point." When a suspend function finishes (or has a result ready), instead of returning a value the normal way, it calls back into the continuation, handing over the result. The continuation knows what to do next.

This is the deep insight: callbacks never actually went away. Coroutines didn’t abolish callbacks; they hid them. The compiler generates the callback plumbing for you, automatically, so your source code stays clean and sequential. You write straight-line code; the compiler converts it into callback-driven code behind the curtain.

Why does the return type become Any?? Because a suspend function now has two possible outcomes:

It finished without suspending, and returns its actual result (a User).
It hit a suspension point (like delay or a network call) and had to pause. In that case it returns a special marker object called COROUTINE_SUSPENDED.

That marker is how the runtime knows: “this function paused; don’t expect a value yet — it’ll come back via the continuation.” Any? is simply the type that can hold either the real result or COROUTINE_SUSPENDED.

Now we have the two pieces: a hidden Continuation parameter, and a way to signal "I paused." The next question is how a single function can pause partway through and then resume from exactly where it left off. That requires the state machine.
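Both outcomes can be observed with the standard library's low-level intrinsics. This is a sketch for illustration only: getName, parked, and trace are hypothetical names, and application code would normally not touch suspendCoroutineUninterceptedOrReturn directly.

```kotlin
import kotlin.coroutines.*
import kotlin.coroutines.intrinsics.*

// Holds the continuation of a paused call so we can resume it by hand.
var parked: Continuation<String>? = null

// Either returns a value directly, or parks the continuation
// and returns the COROUTINE_SUSPENDED marker.
suspend fun getName(cached: Boolean): String =
    suspendCoroutineUninterceptedOrReturn { cont ->
        if (cached) {
            "Anand"                 // outcome 1: plain result, no suspension
        } else {
            parked = cont           // outcome 2: keep "the rest of the program"
            COROUTINE_SUSPENDED
        }
    }

// Runs getName in a bare stdlib coroutine and records the order of events.
fun trace(cached: Boolean): List<String> {
    val log = mutableListOf<String>()
    val body: suspend () -> String = { getName(cached) }
    body.startCoroutine(Continuation(EmptyCoroutineContext) { result ->
        log += "completed with ${result.getOrThrow()}"
    })
    log += "startCoroutine returned"
    // If the call paused, deliver the value now through the continuation.
    parked?.let { parked = null; it.resume("Anand (late)") }
    return log
}

fun main() {
    println(trace(cached = true))    // prints [completed with Anand, startCoroutine returned]
    println(trace(cached = false))   // prints [startCoroutine returned, completed with Anand (late)]
}
```

In the second run the call returned to its caller before any value existed; the value arrived later, through the continuation.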

5. The State Machine — What the Compiler Really Generates

This is the part that makes everything click.

Consider a suspend function with multiple suspension points:

suspend fun loadDashboard() {
    val user = getUser()           // suspension point 1
    val posts = getPosts(user.id)  // suspension point 2
    show(user, posts)
}

A normal function executes top to bottom in one go. But this function might pause at getUser(), vanish for a second while the network responds, then need to resume on the very next line — with the user variable still intact. How can a function "remember" where it stopped and what its local variables were?

The compiler’s answer: it shreds the function into pieces and wraps it in a state machine. Each suspension point becomes a labeled state. Conceptually, the compiler transforms the code above into something like this (heavily simplified, but faithful to the idea):

fun loadDashboard(continuation: Continuation<Unit>): Any? {

    // The compiler creates ONE object that stores both the
    // current state (label) and all local variables across suspensions.
    val sm = continuation as? DashboardStateMachine
        ?: DashboardStateMachine(continuation)

    // Kotlin's `when` does not fall through, so a loop moves us to the
    // next state whenever a call returns without actually suspending.
    while (true) {
        when (sm.label) {
            0 -> {
                sm.label = 1
                val result = getUser(sm)           // pass the state machine as the continuation
                if (result == COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
                sm.result = result                 // didn't suspend: continue straight into state 1
            }
            1 -> {
                sm.user = sm.result as User        // result delivered by resumeWith (or stored above)
                sm.label = 2
                val result = getPosts(sm.user.id, sm)
                if (result == COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
                sm.result = result
            }
            2 -> {
                sm.posts = sm.result as List<Post>
                show(sm.user, sm.posts)
                return Unit
            }
            else -> error("Unexpected label ${sm.label}")
        }
    }
}

Read that slowly, because it contains the entire secret:

label is the program counter. It records which step the function is on. Before each suspension point, the label is bumped to the next value, so when the function is re-entered it jumps straight to the right when branch.
The state machine object stores the local variables (user, posts). This is the key to "remembering": local variables aren't kept on the thread's call stack (which would be gone after suspension). They're stored as fields on a heap object that survives the pause.
After every call to a suspending function, the function checks if (result == COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED. If the called function paused, this function immediately returns the marker too, unwinding the stack and freeing the thread.
When the awaited work completes, the runtime calls back into this same function, but now label has advanced, so execution resumes at the next branch, with all locals restored.

This is why a suspend function with three suspension points compiles, roughly, into a when block with four states. The function isn't one continuous flow anymore; it's a resumable machine that can be paused and re-entered as many times as there are suspension points.

And critically: all of this happens at compile time. There’s no interpreter, no reflection, no runtime parsing of your code. The state machine is plain bytecode, generated once, and it’s extremely efficient. That’s why coroutines are fast.

Let me restate the whole thing in one sentence, because it’s worth memorizing: the compiler turns each suspend function into a single object that holds a label (where am I?) and the local variables (what did I know?), and re-enters that object at the right spot every time the work it was waiting on completes.
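The same idea can be run by hand. Below is a hand-written approximation of the generated code, using only the standard library. It is a sketch of the concept, not the compiler's real output: the Cps-suffixed functions, the pending queue that plays the role of the network, and the String stand-ins for User and Post are all assumptions made for this illustration.

```kotlin
import kotlin.coroutines.*
import kotlin.coroutines.intrinsics.COROUTINE_SUSPENDED

val pending = ArrayDeque<() -> Unit>()   // fake "network": responses waiting to arrive
val output = mutableListOf<String>()

// Hand-written CPS functions: one suspends, one returns immediately.
fun getUserCps(cont: Continuation<Any?>): Any? {
    pending.addLast { cont.resume("Anand") }      // the answer arrives later
    return COROUTINE_SUSPENDED
}
fun getPostsCps(user: String, cont: Continuation<Any?>): Any? =
    listOf("$user's first post")                  // cached: no suspension needed

// The state machine object: label + locals + the caller's continuation.
class DashboardStateMachine(val completion: Continuation<Unit>) : Continuation<Any?> {
    var label = 0
    var result: Any? = null
    var user: String? = null
    override val context: CoroutineContext get() = completion.context
    override fun resumeWith(result: Result<Any?>) {
        this.result = result.getOrThrow()         // simplified: failures just throw here
        val outcome = step(this)                  // re-enter at the saved label
        if (outcome !== COROUTINE_SUSPENDED) completion.resume(Unit)
    }
}

fun loadDashboardCps(completion: Continuation<Unit>): Any? =
    step(DashboardStateMachine(completion))

fun step(sm: DashboardStateMachine): Any? {
    while (true) {
        when (sm.label) {
            0 -> {
                sm.label = 1
                val r = getUserCps(sm)
                if (r === COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
                sm.result = r
            }
            1 -> {
                sm.user = sm.result as String
                sm.label = 2
                val r = getPostsCps(sm.user!!, sm)
                if (r === COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
                sm.result = r
            }
            2 -> {
                output += "show(${sm.user}, ${sm.result})"
                return Unit
            }
            else -> error("unexpected label ${sm.label}")
        }
    }
}

fun runDemo(): List<String> {
    output.clear(); pending.clear()
    val first = loadDashboardCps(Continuation(EmptyCoroutineContext) { output += "dashboard done" })
    output += if (first === COROUTINE_SUSPENDED) "suspended at label 1" else "finished synchronously"
    while (pending.isNotEmpty()) pending.removeFirst().invoke()   // the "network" responds
    return output.toList()
}

fun main() {
    runDemo().forEach(::println)
    // prints:
    // suspended at label 1
    // show(Anand, [Anand's first post])
    // dashboard done
}
```

Notice that the second call never suspended, so the loop carried execution from state 1 into state 2 without another trip through resumeWith.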

6. The `Continuation` Interface and How `resume` Works

Now we can name the object we’ve been describing. That state machine is a Continuation. The interface is tiny:

interface Continuation<in T> {
    val context: CoroutineContext
    fun resumeWith(result: Result<T>)
}

Just two members. context we'll cover in the next section. The star of the show is resumeWith.

When the thing your coroutine was waiting for is ready — the network call returns, the delay timer fires — someone calls continuation.resumeWith(Result.success(theValue)). That single call is what "wakes the coroutine up." resumeWith re-enters the state machine, which reads its label, jumps to the correct branch, and carries on.

Result<T> is also why coroutine exception handling feels natural. It can hold either a success value or a failure. If the awaited operation threw, the runtime calls resumeWith(Result.failure(exception)), and inside the resumed state machine that becomes a regular thrown exception — which is why you can wrap suspend calls in an ordinary try/catch:

try {
    val user = getUser()   // if this fails, the exception surfaces right here
} catch (e: IOException) {
    // handle it like any normal exception
}

No special error callbacks. The CPS machinery routes failures back into your code as plain exceptions at the exact line that was suspended.

So the full cycle is: suspend → return COROUTINE_SUSPENDED and free the thread → wait → resumeWith delivers the result (or error) → state machine re-enters at the right label → continue. That loop, repeated at each suspension point, is the beating heart of every coroutine you've ever written.
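Here is that cycle in miniature: a callback-style API bridged into a suspend function with the standard library's suspendCoroutine, where a failed Result surfaces as an ordinary exception. fetchUserAsync is a hypothetical stand-in for a real network client, and runNow is a small helper written for this sketch.

```kotlin
import kotlin.coroutines.*

// A hypothetical callback-style API standing in for a network client.
fun fetchUserAsync(id: Int, onDone: (Result<String>) -> Unit) {
    if (id > 0) onDone(Result.success("user-$id"))
    else onDone(Result.failure(IllegalArgumentException("bad id $id")))
}

// Bridge: the callback simply forwards its outcome to resumeWith.
suspend fun fetchUser(id: Int): String = suspendCoroutine { cont ->
    fetchUserAsync(id) { result -> cont.resumeWith(result) }
}

// An ordinary try/catch around the suspend call.
suspend fun describe(id: Int): String =
    try {
        "loaded ${fetchUser(id)}"
    } catch (e: IllegalArgumentException) {
        "failed: ${e.message}"
    }

// Runs a suspend block with the bare stdlib API and returns its result.
// Only suitable here because the fake API answers synchronously.
fun runNow(block: suspend () -> String): String {
    var out = "not completed"
    block.startCoroutine(Continuation(EmptyCoroutineContext) { out = it.getOrThrow() })
    return out
}

fun main() {
    println(runNow { describe(7) })    // prints loaded user-7
    println(runNow { describe(-1) })   // prints failed: bad id -1
}
```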

7. Dispatchers: Who Runs the Code, and Where

We now understand how a coroutine pauses and resumes. But one question remains: which thread does the code actually run on? That’s the job of the dispatcher, part of the CoroutineContext.

When resumeWith is called, the dispatcher decides which thread the resumed work executes on. This is what gives you control over threading without ever touching a Thread object directly.

Android developers care about three dispatchers:

Dispatchers.Main: runs on Android's main/UI thread, backed by the main Looper. Use it to touch the UI.
Dispatchers.IO: a shared pool for blocking I/O such as network calls, database reads, and file access. It allows far more threads than there are CPU cores (by default up to 64, or the core count if that is larger) because I/O work spends most of its time waiting.
Dispatchers.Default: a pool sized to the number of CPU cores (with a minimum of two threads), meant for CPU-intensive work such as parsing large JSON, sorting big lists, or image processing.

The classic pattern is to switch dispatchers with withContext:

suspend fun getUser(): User = withContext(Dispatchers.IO) {
    // this block runs on an IO thread — safe to do blocking network work
    api.fetchUser()
}

When you call this from a Dispatchers.Main coroutine, here's the sequence: the coroutine suspends on the main thread, the work hops to an IO thread, and when it's done, withContext automatically dispatches the resumption back to the main thread. You get clean thread-switching with what reads like a single function call. Under the hood it's still the state machine plus the dispatcher choosing the thread for each resumeWith.

This is also why I emphasized earlier that suspend doesn't mean "background thread." A suspend fun runs on whatever dispatcher its caller is using. If you want it off the main thread, you (or the function, via withContext) must say so.
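To see how a dispatcher can hook into resumeWith, here is a toy one built on the standard library's ContinuationInterceptor. It is a simplified sketch, not how Dispatchers.Main or Dispatchers.IO are actually implemented; ExecutorInterceptor and the "toy-dispatcher" thread name are made up for this example.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executor
import java.util.concurrent.Executors
import kotlin.coroutines.*

// A toy dispatcher: wraps every continuation so that resumeWith
// is executed on `executor` instead of on the calling thread.
class ExecutorInterceptor(private val executor: Executor) :
    AbstractCoroutineContextElement(ContinuationInterceptor), ContinuationInterceptor {

    override fun <T> interceptContinuation(continuation: Continuation<T>): Continuation<T> =
        object : Continuation<T> {
            override val context = continuation.context
            override fun resumeWith(result: Result<T>) {
                executor.execute { continuation.resumeWith(result) }
            }
        }
}

// Starts a coroutine whose context contains the toy dispatcher and
// returns the name of the thread the coroutine body ran on.
fun threadInsideCoroutine(): String {
    val executor = Executors.newSingleThreadExecutor { r -> Thread(r, "toy-dispatcher") }
    val latch = CountDownLatch(1)
    var seen = ""
    val body: suspend () -> String = { Thread.currentThread().name }
    body.startCoroutine(Continuation(ExecutorInterceptor(executor)) { result ->
        seen = result.getOrThrow()
        latch.countDown()
    })
    latch.await()          // wait for the coroutine to finish on the other thread
    executor.shutdown()
    return seen
}

fun main() {
    println(threadInsideCoroutine())   // prints toy-dispatcher
}
```

The coroutine body never mentions a thread. The context element alone decided where it ran, which is the same division of labor the real dispatchers rely on.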

8. Structured Concurrency: Scopes, Jobs, and Why Coroutines Don’t Leak

A coroutine doesn’t float around freely. It lives inside a CoroutineScope, and this is one of the most important practical features Kotlin added.

Structured concurrency means every coroutine has a parent, forms a hierarchy, and is bound to a lifecycle. The benefits are huge:

If a parent scope is cancelled, all its child coroutines are cancelled automatically.
A parent waits for all its children to finish before completing.
An unhandled error in a child propagates predictably up the hierarchy.

On Android, this maps perfectly onto component lifecycles. The framework gives you ready-made scopes:

viewModelScope — tied to a ViewModel. When the ViewModel is cleared (the screen goes away for good), every coroutine in this scope is cancelled. No leaked network calls trying to update a dead screen.
lifecycleScope — tied to an Activity or Fragment lifecycle.

This is what quietly prevents an entire category of bugs. In the callback era, a network response could arrive after the user left the screen and crash the app trying to touch a destroyed view. With viewModelScope, that coroutine is simply cancelled when the ViewModel dies — automatically.

The mechanism behind this is the Job. Every coroutine builder returns a handle: launch returns a Job, and async returns a Deferred, which is a Job that also carries a result. The Job represents the coroutine's lifecycle: you can cancel() it, join() it (wait for it), and inspect its state. Jobs form a parent-child tree, and that tree is exactly the structured-concurrency hierarchy. The CoroutineContext we keep mentioning is essentially a bundle holding the Job, the dispatcher, an optional CoroutineExceptionHandler, and an optional name: everything the runtime needs to manage that coroutine.
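The Job tree is easy to observe outside Android. This sketch assumes kotlinx.coroutines is available and uses runBlocking in place of a lifecycle scope; cancelParentDemo is a hypothetical name.

```kotlin
import kotlinx.coroutines.*

// Cancels a parent Job and reports whether each child was cancelled with it.
fun cancelParentDemo(): List<Boolean> = runBlocking {
    val parent = launch {
        repeat(3) { launch { delay(10_000) } }   // three long-running children
        delay(10_000)
    }
    delay(50)                                    // give the children time to start
    val children = parent.children.toList()
    parent.cancelAndJoin()                       // cancel the parent, wait for the whole tree
    children.map { it.isCancelled }
}

fun main() {
    println(cancelParentDemo())   // prints [true, true, true]
}
```

Nobody cancelled the children explicitly. Cancelling the parent was enough, which is the behavior viewModelScope relies on when a ViewModel is cleared.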

9. A Complete Real-World Android Example

Let’s tie every concept together in a realistic screen: a user profile that loads a user and their posts in parallel, switches threads correctly, handles errors, and cancels cleanly.

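import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.async
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
import kotlinx.coroutines.launch

class ProfileViewModel(
    private val repository: ProfileRepository
) : ViewModel() {

    private val _uiState = MutableStateFlow<ProfileUiState>(ProfileUiState.Loading)
    val uiState: StateFlow<ProfileUiState> = _uiState.asStateFlow()

    fun loadProfile(userId: String) {
        // viewModelScope = structured concurrency.
        // Cancelled automatically when the ViewModel is cleared.
        viewModelScope.launch {
            try {
                // coroutineScope groups the two children. If either one fails,
                // the other is cancelled and the exception is rethrown here,
                // where the catch below can handle it. Calling async directly
                // in launch would instead fail the launch coroutine itself.
                val (user, posts) = coroutineScope {
                    // Both calls start immediately and run in parallel.
                    val userDeferred = async { repository.getUser(userId) }
                    val postsDeferred = async { repository.getPosts(userId) }

                    // await() suspends until each result is ready,
                    // without blocking the main thread.
                    userDeferred.await() to postsDeferred.await()
                }

                // We're back on the main dispatcher here, so updating
                // UI state is safe.
                _uiState.value = ProfileUiState.Success(user, posts)

            } catch (e: CancellationException) {
                throw e   // never swallow cancellation
            } catch (e: Exception) {
                // CPS routes the failure back here as a normal exception.
                _uiState.value = ProfileUiState.Error(e.message ?: "Unknown error")
            }
        }
    }
}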

class ProfileRepository(private val api: ApiService) {

    // suspend + withContext(IO): the actual blocking network work
    // runs on an IO thread, then resumption hops back automatically.
    suspend fun getUser(userId: String): User = withContext(Dispatchers.IO) {
        api.fetchUser(userId)
    }

    suspend fun getPosts(userId: String): List<Post> = withContext(Dispatchers.IO) {
        api.fetchPosts(userId)
    }
}

Now read it through the lens of everything we’ve covered:

viewModelScope.launch creates a coroutine bound to the ViewModel's lifecycle. If the user leaves the screen mid-load, the Job is cancelled and both requests are cancelled with it: no leak, no crash.
async { ... } starts two coroutines that run concurrently. This is the real win over sequential calls: both requests are in flight at once, so total time is roughly that of the slower of the two, not the sum. (If you wrote repository.getUser() then repository.getPosts() directly, they'd run one after another.)
await() is a suspension point. The compiler turns this launch block into a state machine with labels at each await. When the first result isn't ready, the coroutine returns COROUTINE_SUSPENDED, freeing the main thread to keep rendering frames.
withContext(Dispatchers.IO) moves the blocking work off the main thread and hops back when done, so the _uiState.value = ... line safely runs on the main thread.
try/catch works because failures come back through resumeWith(Result.failure(...)) and re-surface as ordinary exceptions at the suspended line.
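The "slower of the two, not the sum" claim can be measured with a small sketch. It assumes kotlinx.coroutines and replaces the real repository with two hypothetical fake calls that each take about 300 ms; exact numbers will vary by machine, so none are shown.

```kotlin
import kotlinx.coroutines.*
import kotlin.system.measureTimeMillis

// Hypothetical stand-ins for the repository calls; each "takes" about 300 ms.
suspend fun fakeUser(): String { delay(300); return "user" }
suspend fun fakePosts(): List<String> { delay(300); return listOf("post") }

// Returns (sequential millis, concurrent millis).
fun timings(): Pair<Long, Long> = runBlocking {
    val sequential = measureTimeMillis {
        fakeUser()
        fakePosts()          // starts only after fakeUser() has finished
    }
    val concurrent = measureTimeMillis {
        coroutineScope {
            val user = async { fakeUser() }
            val posts = async { fakePosts() }   // already in flight alongside user
            user.await()
            posts.await()
        }
    }
    sequential to concurrent
}

fun main() {
    val (sequential, concurrent) = timings()
    println("sequential: $sequential ms, concurrent: $concurrent ms")
}
```

On a typical run the sequential figure lands near 600 ms and the concurrent one near 300 ms.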

Every single mechanism from sections 4 through 8 is doing its job in these forty lines. That’s the payoff of understanding the internals: this code stops looking like incantation and starts looking like clockwork you can reason about.

10. Common Mistakes That Suddenly Make Sense

Once you understand the machine, the classic coroutine bugs become obvious instead of mysterious.

Mistake 1: Expecting suspend to move work off the main thread.

suspend fun loadData(): Data {
    return heavyJsonParsing(file)  // STILL on the main thread if called from Main!
}

suspend only enables suspension; it does not change dispatchers. CPU work like parsing must be wrapped in withContext(Dispatchers.Default), or it'll freeze your UI exactly as if it were a regular function.
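A corrected version might look like the sketch below. heavyParse is a hypothetical stand-in for the real parsing work, and runBlocking plays the role of the caller so the example runs outside Android.

```kotlin
import kotlinx.coroutines.*

// Stand-in for heavy CPU work (hypothetical).
fun heavyParse(input: String): Int = input.split(',').sumOf { it.trim().toInt() }

// The function itself moves the work to Dispatchers.Default, so it is
// safe to call from any dispatcher, including Main.
// Returns the parsed value plus the name of the thread that did the work.
suspend fun loadData(input: String): Pair<Int, String> = withContext(Dispatchers.Default) {
    heavyParse(input) to Thread.currentThread().name
}

fun main() = runBlocking {
    val (sum, worker) = loadData("1, 2, 3")
    println(sum)                                     // prints 6
    println(worker != Thread.currentThread().name)   // prints true
}
```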

Mistake 2: Catching CancellationException and swallowing it.

try {
    doWork()
} catch (e: Exception) {   // accidentally catches CancellationException too
    log(e)
}

When a coroutine is cancelled, the runtime throws CancellationException to unwind it. If you catch all exceptions and swallow it, you break structured concurrency's cancellation. Either catch specific exceptions, or rethrow CancellationException.
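One way to write the fix, sketched with kotlinx.coroutines. doWorkSafely, cancelDemo, and the events list are hypothetical names, and delay stands in for the real work.

```kotlin
import kotlinx.coroutines.*

val events = mutableListOf<String>()

suspend fun doWorkSafely() {
    try {
        delay(10_000)                              // stands in for real work
    } catch (e: CancellationException) {
        events += "cancelled, rethrowing"
        throw e                                    // let cancellation keep unwinding
    } catch (e: Exception) {
        events += "real failure: ${e.message}"     // only genuine errors land here
    }
    events += "after work"                         // skipped when cancelled
}

// Starts the work, cancels it, and returns what was recorded.
fun cancelDemo(): List<String> = runBlocking {
    events.clear()
    val job = launch { doWorkSafely() }
    delay(50)                                      // let the work start
    job.cancelAndJoin()
    events.toList()
}

fun main() {
    println(cancelDemo())   // prints [cancelled, rethrowing]
}
```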

Mistake 3: Using GlobalScope in Android.

GlobalScope.launch { ... } creates a coroutine with no lifecycle parent. It won't be cancelled when your screen dies, which is the exact memory-leak problem structured concurrency was built to prevent. Use viewModelScope or lifecycleScope instead, essentially always.

Mistake 4: Forgetting that cancellation is cooperative.

Cancelling a Job doesn't forcibly kill a thread. The coroutine only actually stops at the next suspension point, or when it checks isActive. A tight CPU loop with no suspension points will ignore cancellation entirely. For long computations, periodically call ensureActive() or yield() so cancellation can take effect.
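A minimal sketch of a cooperative loop, assuming kotlinx.coroutines. countUntilCancelled is a hypothetical name, and the counting loop stands in for a long computation.

```kotlin
import kotlinx.coroutines.*

// Counts upward on a background thread until the Job is cancelled,
// then returns how far it got.
fun countUntilCancelled(): Long = runBlocking {
    var iterations = 0L
    val job = launch(Dispatchers.Default) {
        while (true) {
            ensureActive()      // throws CancellationException once the Job is cancelled
            iterations++
        }
    }
    delay(50)
    job.cancelAndJoin()         // returns only because the loop cooperates
    iterations
}

fun main() {
    println(countUntilCancelled() > 0)   // prints true
}
```

Remove the ensureActive() call and cancelAndJoin() never returns: the loop contains no suspension point and no check, so the cancellation request is never noticed.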

Each of these is just the internals leaking through. Knowing the machine is knowing the bug.

Summary

Let’s collapse the whole journey into a single mental model you can carry around:

A coroutine is lightweight resumable work that runs on threads but isn’t bound to one. Suspension frees the thread instead of blocking it.
The suspend keyword tells the compiler to rewrite a function in Continuation-Passing Style, adding a hidden Continuation parameter and an Any? return type that can hold either the result or the COROUTINE_SUSPENDED marker.
The compiler shreds each suspend function into a state machine: a single object holding a label (where execution is) and the local variables (what it knew), so it can pause and re-enter at the exact right spot.
A Continuation is the callback the compiler generates for you; resumeWith wakes the machine and delivers a Result that surfaces as a normal value or a normal exception.
Dispatchers decide which thread runs the resumed code; withContext switches threads cleanly.
Structured concurrency ties coroutines to lifecycles through scopes and Jobs, so they cancel automatically and never leak.

Coroutines aren’t magic. They’re a compile-time transformation that turns your clean, sequential code into an efficient, callback-driven state machine, and then hides every bit of that plumbing so you never have to see it. The next time you write viewModelScope.launch { }, you'll know exactly what the compiler built underneath: a little resumable machine, quietly pausing and resuming, never holding a thread hostage, never leaking a screen.

That’s the whole story, top to bottom.
