Optimizing Flask Application Performance

Flask is a lightweight and flexible web framework for building small to medium-sized applications. It’s commonly used in projects ranging from simple personal blogs to more complex applications, such as REST APIs, SaaS platforms, e-commerce websites, and data-driven dashboards.

However, as your application scales in traffic or grows in complexity, you may begin to notice performance bottlenecks. Whether you’re building a content management system (CMS), an API for a mobile app, or a real-time data visualization tool, optimizing Flask’s performance becomes crucial to delivering a responsive and scalable user experience.

In this tutorial, you will explore various techniques and best practices to optimize a Flask application’s performance.

Prerequisites

  • A server running a currently supported release of Ubuntu, with a non-root user that has sudo privileges and an active firewall.
  • Familiarity with the Linux command line.
  • A basic understanding of Python programming.
  • Python 3.7 or higher installed on your Ubuntu system.

Setting Up Your Flask Environment

Ubuntu 24.04 ships Python 3 by default. Open the terminal and run the following command to double-check the Python 3 installation:

root@ubuntu:~# python3 --version
Python 3.12.3

If Python 3 is already installed, the command prints the installed version. If it is not installed, run the following command to install it:

root@ubuntu:~# sudo apt install python3

Next, you need to install the pip package installer on your system:

root@ubuntu:~# sudo apt install python3-pip

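Once pip is installed, use it to install Flask. It’s recommended to do this inside a virtual environment to avoid conflicts with other packages on your system. On Ubuntu, the venv module is packaged separately; if the first command below fails, install it with sudo apt install python3-venv.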


root@ubuntu:~# python3 -m venv myprojectenv
root@ubuntu:~# source myprojectenv/bin/activate


root@ubuntu:~# pip install Flask
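
If you want to confirm that the virtual environment is really the one in use before installing more packages, a small check like the following can help. It is only a sketch: the function name in_virtualenv is made up for this example, and it relies on the standard library alone.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the system-wide Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

With myprojectenv activated, the prefix printed should point inside the myprojectenv directory.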

Create a Flask Application

The next step is to write the Python code for the Flask application. To create a new script, navigate to your directory of choice:

root@ubuntu:~# cd ~/path-to-your-script-directory

When inside the directory, create a new Python file, app.py, and import Flask. Then, initialize a Flask application and create a basic route.

root@ubuntu:~# nano app.py

This will open up a blank text editor. Write your logic here or copy the following code:

app.py

from flask import Flask, jsonify, request 

app = Flask(__name__) 

# Simulate a slow endpoint 
@app.route('/slow') 
def slow(): 
    import time 
    time.sleep(2) # to simulate a slow response 
    return jsonify(message="This request was slow!") 

# Simulate an intensive database operation 
@app.route('/db') 
def db_operation(): 
    # This is a dummy function to simulate a database query 
    result = {"name": "User", "email": "user@example.com"} 
    return jsonify(result) 

# Simulate a static file being served 
@app.route('/') 
def index(): 
    return "<h1>Welcome to the Sample Flask App</h1>"

if __name__ == '__main__':
    app.run(debug=True)

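Now, let’s run the Flask application:

root@ubuntu:~# python3 app.py

The development server listens on http://127.0.0.1:5000 by default. Leave it running and open a second terminal for the tests that follow.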

Test the / Endpoint (Serves Static Content)

You can test the endpoints with the following curl commands:

root@ubuntu:~# curl http://127.0.0.1:5000/


Output

<h1>Welcome to the Sample Flask App</h1>

Test the /slow Endpoint (Simulates a Slow Response)

root@ubuntu:~# time curl http://127.0.0.1:5000/slow

This command wraps curl in the Linux time utility, which measures how long a command takes to run. It reports three values:

  • Real time: The actual elapsed time from start to finish of the command.
  • User time: The amount of CPU time spent in user mode.
  • System time: The amount of CPU time spent in kernel mode.

This lets you measure how long the slow endpoint actually takes to respond. The output should look similar to this:

Output

{"message":"This request was slow!"} 
curl http://127.0.0.1:5000/slow 0.00s user 0.01s system 0% cpu 2.023 total

This request takes about 2 seconds to respond due to the time.sleep(2) call simulating a slow response.
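
The gap between elapsed time and CPU time can also be reproduced in Python. The sketch below is an illustration rather than part of the application: the helper names measure and simulated_slow_handler are invented here, and the sleep is shortened to 0.2 seconds. It shows that a sleeping handler consumes wall-clock time while using almost no CPU, which is why the real time dominates in the output above.

```python
import time


def measure(func):
    """Run func once and return (wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed "real" time
    cpu_start = time.process_time()    # user + system CPU time of this process
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def simulated_slow_handler():
    time.sleep(0.2)  # stands in for the tutorial's time.sleep(2)


wall, cpu = measure(simulated_slow_handler)
print(f"real: {wall:.3f}s  cpu: {cpu:.3f}s")
```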

Test the /db Endpoint (Simulates a Database Operation)


root@ubuntu:~# curl http://127.0.0.1:5000/db


Output

{"email":"user@example.com","name":"User"}

By testing these endpoints using curl, you can verify that your Flask application is running correctly and that the responses are as expected.

In the next section, you will learn to optimize the application’s performance using various techniques.

Use a Production-Ready WSGI Server

Flask’s built-in development server is not designed for production environments. To handle concurrent requests efficiently, you should switch to a production-ready WSGI server like Gunicorn.
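
To get a feel for why several workers help with blocking requests, consider the rough analogy below. It uses a thread pool from the standard library as a stand-in for Gunicorn's worker processes, so it is not how Gunicorn itself is implemented; handle_request and the 0.2-second sleep are assumptions made for the demonstration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id: int) -> int:
    time.sleep(0.2)  # stands in for a slow, blocking request handler
    return request_id


def run_sequential(n: int) -> float:
    """Handle n requests one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for i in range(n):
        handle_request(i)
    return time.perf_counter() - start


def run_with_workers(n: int, workers: int) -> float:
    """Handle n requests with a pool of workers and return the elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(handle_request, range(n)))
    return time.perf_counter() - start


sequential = run_sequential(4)
pooled = run_with_workers(4, workers=4)
print(f"one worker: {sequential:.2f}s  four workers: {pooled:.2f}s")
```

Four blocking requests handled one at a time take roughly four times as long as the same requests spread over four workers.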

Install and Set Up Gunicorn

Let’s install Gunicorn:

root@ubuntu:~# pip install gunicorn

Run the Flask application using Gunicorn with 4 worker processes:

root@ubuntu:~# gunicorn -w 4 -b 0.0.0.0:8000 app:app

Output

[2024-09-13 18:37:24 +0530] [99925] [INFO] Starting gunicorn 23.0.0 
[2024-09-13 18:37:24 +0530] [99925] [INFO] Listening at: http://0.0.0.0:8000 (99925) 
[2024-09-13 18:37:24 +0530] [99925] [INFO] Using worker: sync 
[2024-09-13 18:37:24 +0530] [99926] [INFO] Booting worker with pid: 99926 
[2024-09-13 18:37:25 +0530] [99927] [INFO] Booting worker with pid: 99927 
[2024-09-13 18:37:25 +0530] [99928] [INFO] Booting worker with pid: 99928 
[2024-09-13 18:37:25 +0530] [99929] [INFO] Booting worker with pid: 99929 
[2024-09-13 18:37:37 +0530] [99925] [INFO] Handling signal: winch 
^C[2024-09-13 18:38:51 +0530] [99925] [INFO] Handling signal: int 
[2024-09-13 18:38:51 +0530] [99927] [INFO] Worker exiting (pid: 99927) 
[2024-09-13 18:38:51 +0530] [99926] [INFO] Worker exiting (pid: 99926) 
[2024-09-13 18:38:51 +0530] [99928] [INFO] Worker exiting (pid: 99928) 
[2024-09-13 18:38:51 +0530] [99929] [INFO] Worker exiting (pid: 99929) 
[2024-09-13 18:38:51 +0530] [99925] [INFO] Shutting down: Master

Benefits of Using Gunicorn

  • Concurrent Request Handling: Gunicorn allows multiple requests to be processed simultaneously by using multiple worker processes.
  • Load Balancing: Incoming requests are distributed across the worker processes (the operating system hands each connection to an available worker), making better use of server resources.
  • Asynchronous Workers: With asynchronous workers like gevent, it can serve many slow or long-lived I/O-bound requests without one request blocking the others.
  • Scalability: You can raise the number of worker processes to handle more concurrent requests on one machine, and run Gunicorn on several machines behind a load balancer to scale horizontally.
  • Fault Tolerance: It automatically replaces unresponsive or crashed workers, ensuring high availability.
  • Production-Ready: Unlike the Flask development server, Gunicorn is optimized for production environments with better security, stability, and performance features.

By switching to Gunicorn for production, you can significantly improve the throughput and responsiveness of your Flask application, making it ready to handle real-world traffic efficiently.
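
Instead of passing every option on the command line, Gunicorn can read its settings from a Python configuration file. The following is a minimal sketch, assuming a file named gunicorn.conf.py next to app.py; the values are starting points chosen for illustration, not tuned recommendations for your workload.

```python
# gunicorn.conf.py (assumed file name, placed next to app.py)
import multiprocessing

# Address and port to listen on; mirrors the -b flag used earlier.
bind = "0.0.0.0:8000"

# Worker count; (2 x cores) + 1 is the starting point suggested in the
# Gunicorn documentation and should be tuned against real traffic.
workers = multiprocessing.cpu_count() * 2 + 1

# The default synchronous worker class, as shown in the startup log.
worker_class = "sync"

# Seconds a worker may stay silent before it is killed and restarted.
timeout = 30
```

You would then start the server with gunicorn -c gunicorn.conf.py app:app.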

Enable Caching to Reduce Load

Caching is one of the best ways to improve Flask’s performance by reducing redundant processing. Here, you’ll add Flask-Caching to cache the result of the /slow route.

Install and Configure Flask-Caching with Redis

Install the necessary packages:

root@ubuntu:~# pip install Flask-Caching redis

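These packages provide only the caching extension and the Redis client library. The configuration below assumes a Redis server is already listening on localhost:6379; on Ubuntu, sudo apt install redis-server provides one.

Next, open app.py in the editor and add caching to the /slow route: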


root@ubuntu:~# nano app.py


app.py

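from flask import Flask, jsonify
from flask_caching import Cache

app = Flask(__name__)

# Configure Flask-Caching with Redis
# ('RedisCache' is the current backend name; the older 'redis' alias is deprecated)
app.config['CACHE_TYPE'] = 'RedisCache'
app.config['CACHE_REDIS_HOST'] = 'localhost'
app.config['CACHE_REDIS_PORT'] = 6379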

cache = Cache(app) 

@app.route('/slow') 
@cache.cached(timeout=60) 
def slow(): 
    import time 
    time.sleep(2) # Simulate a slow response 
    return jsonify(message="This request was slow!")

After the first request to /slow, subsequent requests within 60 seconds will be served from the cache, bypassing the time.sleep() function. This reduces the server load and speeds up response times.
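
Conceptually, a timeout-based cache stores the result of the first call together with an expiry time and returns the stored value until that time has passed. The sketch below is a simplified, process-local stand-in written with the standard library only; it is not how Flask-Caching is implemented (which stores entries in Redis in this setup), and the decorator shown here only handles functions without arguments.

```python
import functools
import time


def cached(timeout: float):
    """Cache a zero-argument function's result for `timeout` seconds."""
    def decorator(func):
        state = {"value": None, "expires_at": float("-inf")}

        @functools.wraps(func)
        def wrapper():
            now = time.monotonic()
            if now >= state["expires_at"]:
                # Cache miss or expired entry: run the real function.
                state["value"] = func()
                state["expires_at"] = now + timeout
            return state["value"]

        return wrapper
    return decorator


calls = 0


@cached(timeout=60)
def slow():
    global calls
    calls += 1
    time.sleep(0.2)  # stands in for the tutorial's time.sleep(2)
    return {"message": "This request was slow!"}


first = slow()   # runs the function body
second = slow()  # served from the in-memory cache
print("function body executed", calls, "time(s)")
```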

Verify Cached Data

To verify that responses are being cached, run the following commands against the /slow endpoint.

First Request to the /slow Endpoint

The first request takes the full 2 seconds. Once it completes, the result of the /slow route is stored in the cache.


root@ubuntu:~# time curl http://127.0.0.1:5000/slow


Output

{"message":"This request was slow!"} 
curl http://127.0.0.1:5000/slow 0.00s user 0.01s system 0% cpu 2.023 total

Subsequent Request to the /slow Endpoint within 60 Seconds


root@ubuntu:~# time curl http://127.0.0.1:5000/slow


Output

{"message":"This request was slow!"} 
curl http://127.0.0.1:5000/slow 0.00s user 0.00s system 0% cpu 0.015 total

The second request returns in about 15 milliseconds instead of 2 seconds because the response is served from the cache rather than by running the slow() function again.

Optimize Database Queries

Database queries can often become a performance bottleneck. In this section, you’ll simulate database query optimization using SQLAlchemy and connection pooling.

Simulate a Database Query with Connection Pooling

First, let’s install Flask-SQLAlchemy, which integrates SQLAlchemy with Flask:

root@ubuntu:~# pip install Flask-SQLAlchemy

Update app.py to Configure Connection Pooling

app.py

from flask_sqlalchemy import SQLAlchemy 
from sqlalchemy import text 

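# Configure the database connection and its connection pool
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///test.db'
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
# Flask-SQLAlchemy 3.x no longer reads SQLALCHEMY_POOL_SIZE; pool settings are
# passed to the engine instead (assumes SQLAlchemy 2.0+, where file-based
# SQLite databases use a pooled engine)
app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {'pool_size': 5} # Connection pool size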

db = SQLAlchemy(app) 

@app.route('/db1') 
def db_operation_pooling(): 
    # Simulate a database query 
    result = db.session.execute(text('SELECT 1')).fetchall() 
    return jsonify(result=str(result))
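
The idea behind connection pooling is to open a fixed number of connections once and reuse them, rather than paying the connection cost on every request. The sketch below illustrates that idea with sqlite3 and a queue from the standard library. The ConnectionPool class is hypothetical and far simpler than SQLAlchemy's own pool, which also handles overflow, timeouts, and stale connections.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Keep a fixed number of open connections and hand them out on demand."""

    def __init__(self, database: str, size: int = 5):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a connection move between threads.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self):
        conn = self._idle.get()      # blocks while every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)     # return it to the pool instead of closing


pool = ConnectionPool(":memory:", size=2)
seen = set()

for _ in range(10):
    with pool.connection() as conn:
        seen.add(id(conn))
        row = conn.execute("SELECT 1").fetchone()

print("queries run: 10, distinct connections used:", len(seen))
```

Ten queries are served by the same two connections, which is the saving a pool provides.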

Run a Curl Request to the /db1 Route

root@ubuntu:~# curl http://127.0.0.1:5000/db1

The response is a JSON object whose result field contains the stringified rows returned by SELECT 1.

Enable Gzip Compression

Compressing your responses can drastically reduce the amount of data transferred between your server and clients, improving performance.

Install and Configure Flask-Compress

Let’s install the Flask-Compress package:

root@ubuntu:~# pip install Flask-Compress

Update app.py to Enable Compression

app.py

from flask_compress import Compress 

# Enable response compression for the Flask app.
# Responses are compressed before being sent to clients that advertise
# support in the Accept-Encoding header, reducing data transfer.
Compress(app)

# Named compressed_page so it does not shadow the imported Compress class
@app.route('/compress')
def compressed_page():
    return "<h1>Welcome to the optimized Flask app!</h1>"
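
To see what compression buys you, it helps to compare payload sizes directly. The sketch below uses Python's built-in gzip module on a made-up, repetitive HTML body; it does not go through Flask-Compress, so treat it as an illustration of the underlying effect only. Note that Flask-Compress leaves very small responses uncompressed by default (its COMPRESS_MIN_SIZE setting), so a one-line page like the route above is unlikely to be compressed in practice.

```python
import gzip

# A repetitive HTML payload, comfortably larger than a one-line response.
body = ("<li>Welcome to the optimized Flask app!</li>\n" * 200).encode("utf-8")

compressed = gzip.compress(body)
print(f"original: {len(body)} bytes, gzip: {len(compressed)} bytes")

# A client reverses this step when the response carries Content-Encoding: gzip.
restored = gzip.decompress(compressed)
```

When testing a real endpoint, the client has to request compression, for example by sending an Accept-Encoding: gzip header with curl.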

Offload Intensive Tasks to Celery

For resource-heavy operations like sending emails or processing large datasets, it’s best to offload them to background tasks using Celery. This prevents long-running tasks from blocking incoming requests.

Why Use Celery?

  • Improved response times: User requests complete faster by delegating heavy tasks to workers.
  • Better scalability: Celery can distribute tasks across multiple machines.
  • Handles complex tasks: Useful for large computations and long-running background tasks.
  • Task scheduling: Built-in support for periodic tasks and retries.
  • Works with message brokers: Integrates with Redis, RabbitMQ, and others for asynchronous processing.

By leveraging Celery, you can ensure that your Flask application remains responsive even when dealing with computationally intensive or I/O-bound tasks.

Set Up Celery for Background Tasks

Let’s install Celery:

root@ubuntu:~# pip install Celery

Update app.py to Configure Celery for Asynchronous Tasks

app.py


from celery import Celery

celery = Celery(app.name, broker='redis://localhost:6379/0')

@celery.task
def long_task():
    import time
    time.sleep(10)  # Simulate a long task
    return "Task Complete"

@app.route('/start-task')
def start_task():
    long_task.delay()
    return 'Task started'


In a separate terminal, start the Celery worker:

root@ubuntu:~# celery -A app.celery worker --loglevel=info

Output

------------- celery@your-computer-name v5.2.7 (dawn-chorus)
--- ***** ----- 
-- ******* ---- Linux-x.x.x-x-generic-x86_64-with-glibc2.xx 2023-xx-xx
- *** --- * --- 
- ** ---------- [config]
- ** ---------- .> app:         app:0x7f8b8c0b3cd0
- ** ---------- .> transport:   redis://localhost:6379/0
- ** ---------- .> results:     disabled://
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** ----- 
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery

[tasks]
  . app.long_task

[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] Connected to redis://localhost:6379/0
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] mingle: searching for neighbors
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] mingle: all alone
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] celery@your-computer-name ready.

Run a Curl Command to Trigger the Task

root@ubuntu:~# curl http://127.0.0.1:5000/start-task

Output

Task started

Understanding the start_task() Function

  • It calls long_task.delay(), which asynchronously starts the Celery task. This means the task is queued to run in the background, but the function doesn’t wait for it to complete.
  • It immediately returns the string 'Task started'.

The important thing to note is that the actual long-running task (simulated by the 10-second sleep) is executed asynchronously by Celery. The Flask route doesn’t wait for this task to complete before responding to the request.

In the terminal running the Celery worker, the task is logged as received right away. After about 10 seconds, when the task completes, the output will be similar to this:

[2024-xx-xx xx:xx:xx,xxx: INFO/MainProcess] Task app.long_task[task-id] received
[2024-xx-xx xx:xx:xx,xxx: INFO/ForkPoolWorker-1] Task app.long_task[task-id] succeeded in 10.xxxs: 'Task Complete'
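
The pattern at work here is a queue with a separate worker: the request handler only enqueues the job and returns, while the worker picks it up later. The sketch below reproduces that pattern in a single process with the standard library. It is an analogy under stated assumptions, not Celery's implementation: a queue.Queue plays the role of the Redis broker, a thread plays the role of the worker process, and the sleep is shortened to 0.2 seconds.

```python
import queue
import threading
import time

task_queue = queue.Queue()   # plays the role of the Redis broker
results = []                 # plays the role of the worker's log


def worker():
    """Pull tasks off the queue and run them, like a Celery worker process."""
    while True:
        task = task_queue.get()
        if task is None:        # sentinel: shut the worker down
            break
        results.append(task())
        task_queue.task_done()


def long_task():
    time.sleep(0.2)  # stands in for the tutorial's time.sleep(10)
    return "Task Complete"


def start_task():
    """Queue the task and return immediately, like long_task.delay()."""
    task_queue.put(long_task)
    return "Task started"


threading.Thread(target=worker, daemon=True).start()

started = time.perf_counter()
response = start_task()
enqueue_seconds = time.perf_counter() - started

task_queue.join()            # wait here only so the demo can show the result
print(response, f"(returned in {enqueue_seconds:.4f}s)", results)
```

The call to start_task() returns long before the task itself finishes, which mirrors the behavior of the /start-task route.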

Celery in Production

In a production environment, implementing Celery involves:

  • Using a robust message broker like RabbitMQ.
  • Employing a dedicated result backend (e.g., PostgreSQL).
  • Managing workers with process control systems (e.g., Supervisor).
  • Implementing monitoring tools (e.g., Flower).
  • Enhancing error handling and logging.
  • Utilizing task prioritization.
  • Scaling with multiple workers across different machines.
  • Ensuring proper security measures.
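
Several of these points translate into Celery settings. The module below is a hedged sketch of what such a configuration might look like: the file name celeryconfig.py, the credentials, the host names, and all numeric values are placeholders chosen for illustration, and the PostgreSQL result backend additionally assumes SQLAlchemy and a PostgreSQL driver are installed.

```python
# celeryconfig.py (assumed module name; every value below is a placeholder)

# RabbitMQ as the message broker instead of the Redis instance used earlier.
broker_url = "amqp://guest:guest@localhost:5672//"

# Store task results in PostgreSQL through Celery's SQLAlchemy result backend.
result_backend = "db+postgresql://celery_user:change-me@localhost/celery_results"

# Acknowledge a message only after the task finishes, so a crashed worker
# does not silently drop it.
task_acks_late = True

# Fetch one message at a time per worker process so long tasks spread evenly.
worker_prefetch_multiplier = 1

# Hard and soft limits (seconds) so a stuck task cannot occupy a worker forever.
task_time_limit = 300
task_soft_time_limit = 270

# Recycle worker processes periodically to contain memory leaks.
worker_max_tasks_per_child = 1000
```

The Celery instance can load such a module with celery.config_from_object("celeryconfig").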

Conclusion

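In this tutorial, you learned how to optimize a Flask application by serving it with Gunicorn, caching responses with Flask-Caching and Redis, configuring database connection pooling with SQLAlchemy, compressing responses with Flask-Compress, and offloading long-running work to Celery. By following these steps, you can improve the performance, scalability, and responsiveness of your Flask application, ensuring it runs efficiently even under heavy load.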
