Optimizing Flask Application Performance
Flask is a lightweight and flexible web framework for building small- to medium-sized applications. It’s commonly used in projects ranging from simple personal blogs to more complex applications, such as REST APIs, SaaS platforms, e-commerce websites, and data-driven dashboards.
However, as your application scales in traffic or grows in complexity, you may begin to notice performance bottlenecks. Whether you’re building a content management system (CMS), an API for a mobile app, or a real-time data visualization tool, optimizing Flask’s performance becomes crucial to delivering a responsive and scalable user experience.
In this tutorial, you will explore various techniques and best practices to optimize a Flask application’s performance.
Prerequisites
- A server running a supported version of Ubuntu, with a non-root user that has sudo privileges and an active firewall.
- Familiarity with the Linux command line.
- A basic understanding of Python programming.
- Python 3.7 or higher installed on your Ubuntu system.
Setting Up Your Flask Environment
Ubuntu 24.04 ships Python 3 by default. Open the terminal and run the following command to double-check the Python 3 installation:
root@ubuntu:~# python3 --version
Python 3.12.3
If Python 3 is already installed on your machine, the above command prints its version. If it is not installed, you can install it with the following command:
root@ubuntu:~# sudo apt install python3
Next, you need to install the pip package installer on your system:
root@ubuntu:~# sudo apt install python3-pip
With pip installed, you can now install Flask. It’s recommended to do this in a virtual environment to avoid conflicts with other packages on your system. On Ubuntu, creating a virtual environment may first require the python3-venv package (sudo apt install python3-venv).
root@ubuntu:~# python3 -m venv myprojectenv
root@ubuntu:~# source myprojectenv/bin/activate
root@ubuntu:~# pip install Flask
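To confirm that the virtual environment is actually active before installing further packages, a quick check like the one below can help. This is only a sketch: the helper name in_virtualenv is made up for illustration, and it relies on the standard-library behavior that sys.prefix differs from sys.base_prefix inside an environment created with venv.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the system installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
print("Interpreter prefix:", sys.prefix)
```

If this prints False after you ran the activate script, the shell is most likely still using the system interpreter.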
Create a Flask Application
The next step is to write the Python code for the Flask application. To create a new script, navigate to your directory of choice:
root@ubuntu:~# cd ~/path-to-your-script-directory
When inside the directory, create a new Python file, app.py, and import Flask. Then, initialize a Flask application and create a basic route.
root@ubuntu:~# nano app.py
This will open up a blank text editor. Write your logic here or copy the following code:
app.py
import time

from flask import Flask, jsonify

app = Flask(__name__)

# Simulate a slow endpoint
@app.route('/slow')
def slow():
    time.sleep(2)  # Simulate a slow response
    return jsonify(message="This request was slow!")

# Simulate an intensive database operation
@app.route('/db')
def db_operation():
    # This is a dummy function to simulate a database query
    result = {"name": "User", "email": "user@example.com"}
    return jsonify(result)

# Simulate a static file being served
@app.route('/')
def index():
    return "<h1>Welcome to the Sample Flask App</h1>"

if __name__ == '__main__':
    app.run(debug=True)
Now, let’s run the Flask application:
root@ubuntu:~# flask run
Test the / Endpoint (Serves Static Content)
You can test the endpoints with the following curl commands:
root@ubuntu:~# curl http://127.0.0.1:5000/
Output
<h1>Welcome to the Sample Flask App</h1>
Test the /slow Endpoint (Simulates a Slow Response)
root@ubuntu:~# time curl http://127.0.0.1:5000/slow
To check this slow endpoint, use the Linux time command, which measures the execution time of a given command or program. It provides three main pieces of information:
- Real time: The actual elapsed time from start to finish of the command.
- User time: The amount of CPU time spent in user mode.
- System time: The amount of CPU time spent in kernel mode.
This will help us measure the actual time taken by our slow endpoint. The output might look something like this (the sample below uses the zsh format, which reports real time as total):
Output
{"message":"This request was slow!"}
curl http://127.0.0.1:5000/slow 0.00s user 0.01s system 0% cpu 2.023 total
This request takes about 2 seconds to respond due to the time.sleep(2) call simulating a slow response.
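The gap between real time and CPU time in that output can also be reproduced from Python. The sketch below is illustrative only: the measure helper is a made-up name, and it uses a shorter 0.2-second sleep in place of the 2-second one. It compares wall-clock time (time.perf_counter) with CPU time (time.process_time) for a sleeping call.

```python
import time


def measure(func):
    """Run func once and return (wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock time, like "real"
    cpu_start = time.process_time()    # user + system CPU time of this process
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


# A sleeping call waits without using the CPU, so wall time dominates.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"wall: {wall:.3f}s, cpu: {cpu:.3f}s")
```

A large wall time with almost no CPU time, as here, suggests the request is waiting rather than computing. That kind of endpoint usually benefits from caching or background processing more than from faster code.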
Test the /db Endpoint (Simulates a Database Operation)
root@ubuntu:~# curl http://127.0.0.1:5000/db
Output
{"email":"user@example.com","name":"User"}
By testing these endpoints using curl, you can verify that your Flask application is running correctly and that the responses are as expected.
In the next section, you will learn to optimize the application’s performance using various techniques.
Use a Production-Ready WSGI Server
Flask’s built-in development server is not designed for production environments. To handle concurrent requests efficiently, you should switch to a production-ready WSGI server like Gunicorn.
Install and Set Up Gunicorn
Let’s install Gunicorn:
root@ubuntu:~# pip install gunicorn
Run the Flask application using Gunicorn with 4 worker processes:
root@ubuntu:~# gunicorn -w 4 -b 0.0.0.0:8000 app:app
Output
% /Library/Python/3.9/bin/gunicorn -w 4 -b 0.0.0.0:8000 app:app
[2024-09-13 18:37:24 +0530] [99925] [INFO] Starting gunicorn 23.0.0
[2024-09-13 18:37:24 +0530] [99925] [INFO] Listening at: http://0.0.0.0:8000 (99925)
[2024-09-13 18:37:24 +0530] [99925] [INFO] Using worker: sync
[2024-09-13 18:37:24 +0530] [99926] [INFO] Booting worker with pid: 99926
[2024-09-13 18:37:25 +0530] [99927] [INFO] Booting worker with pid: 99927
[2024-09-13 18:37:25 +0530] [99928] [INFO] Booting worker with pid: 99928
[2024-09-13 18:37:25 +0530] [99929] [INFO] Booting worker with pid: 99929
[2024-09-13 18:37:37 +0530] [99925] [INFO] Handling signal: winch
^C[2024-09-13 18:38:51 +0530] [99925] [INFO] Handling signal: int
[2024-09-13 18:38:51 +0530] [99927] [INFO] Worker exiting (pid: 99927)
[2024-09-13 18:38:51 +0530] [99926] [INFO] Worker exiting (pid: 99926)
[2024-09-13 18:38:51 +0530] [99928] [INFO] Worker exiting (pid: 99928)
[2024-09-13 18:38:51 +0530] [99929] [INFO] Worker exiting (pid: 99929)
[2024-09-13 18:38:51 +0530] [99925] [INFO] Shutting down: Master
Benefits of Using Gunicorn
- Concurrent Request Handling: Gunicorn allows multiple requests to be processed simultaneously by using multiple worker processes.
- Load Balancing: It balances incoming requests across worker processes, ensuring optimal utilization of server resources.
- Asynchronous Workers: With asynchronous workers like gevent, it can efficiently handle long-running tasks without blocking other requests.
- Scalability: You can raise the number of worker processes to use more of a server’s CPU cores and handle more concurrent requests.
- Fault Tolerance: It automatically replaces unresponsive or crashed workers, ensuring high availability.
- Production-Ready: Unlike the Flask development server, Gunicorn is optimized for production environments with better security, stability, and performance features.
By switching to Gunicorn for production, you can significantly improve the throughput and responsiveness of your Flask application, making it ready to handle real-world traffic efficiently.
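The worker count does not have to be hard-coded on the command line. Gunicorn configuration files are plain Python, so a sketch like the following could live in a file named gunicorn.conf.py next to app.py. The values are assumptions for illustration: the (2 x cores) + 1 formula is the starting point suggested in Gunicorn’s documentation, not a tuned figure for this application.

```python
# gunicorn.conf.py -- illustrative settings, adjust for your own workload
import multiprocessing

# Address and port to listen on (same as the -b flag used above)
bind = "0.0.0.0:8000"

# Starting point from the Gunicorn docs: (2 x CPU cores) + 1 workers
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before it is killed and restarted
timeout = 30

print(f"Configured {workers} workers on {bind}")
```

You could then start the server with gunicorn -c gunicorn.conf.py app:app and adjust the worker count after measuring real traffic.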
Enable Caching to Reduce Load
Caching is one of the best ways to improve Flask’s performance by reducing redundant processing. Here, you’ll add Flask-Caching to cache the result of the /slow route.
Install and Configure Flask-Caching with Redis
Install the necessary packages:
root@ubuntu:~# pip install Flask-Caching redis
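Update app.py to add caching to the /slow route. The configuration below assumes a Redis server is already running on localhost at port 6379 (on Ubuntu, the redis-server package provides one).
Open the editor and update the app.py file with the following: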
root@ubuntu:~# nano app.py
app.py
import time

from flask import Flask, jsonify
from flask_caching import Cache

app = Flask(__name__)

# Configure Flask-Caching with Redis
# ('RedisCache' is the current backend name; the older 'redis' alias is deprecated)
app.config['CACHE_TYPE'] = 'RedisCache'
app.config['CACHE_REDIS_HOST'] = 'localhost'
app.config['CACHE_REDIS_PORT'] = 6379
cache = Cache(app)

@app.route('/slow')
@cache.cached(timeout=60)  # Keep the response in the cache for 60 seconds
def slow():
    time.sleep(2)  # Simulate a slow response
    return jsonify(message="This request was slow!")
After the first request to /slow, subsequent requests within 60 seconds will be served from the cache, bypassing the time.sleep() function. This reduces the server load and speeds up response times.
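To make the timeout behavior concrete, here is a minimal standard-library sketch of a time-based cache. It is not how Flask-Caching is implemented: the ttl_cache decorator, the injected clock, and the slow_view function are all hypothetical names used to show the idea of serving a stored result until it expires.

```python
import functools
import time


def ttl_cache(timeout, clock=time.monotonic):
    """Cache a function's result per argument tuple for `timeout` seconds."""
    def decorator(func):
        store = {}  # maps args -> (expires_at, value)

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]           # still fresh: skip the slow call
            value = func(*args)         # missing or expired: recompute
            store[args] = (now + timeout, value)
            return value
        return wrapper
    return decorator


calls = []          # records every real execution of the view
fake_now = [0.0]    # controllable clock so the demo needs no sleeping


@ttl_cache(timeout=60, clock=lambda: fake_now[0])
def slow_view():
    calls.append(1)
    return {"message": "This request was slow!"}


slow_view()          # first call runs the function
slow_view()          # second call within 60 seconds is served from the cache
fake_now[0] = 61.0   # move the clock past the timeout
slow_view()          # entry expired, so the function runs again
print("real executions:", len(calls))
```

The function body runs twice for three calls, which mirrors what happens to the /slow route: only the first request in each 60-second window pays the 2-second cost.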
Verify Cached Data
To verify that the data is being cached, run the following commands against the /slow endpoint.
First Request to the /slow Endpoint
After this request completes, the result of the /slow route is cached.
root@ubuntu:~# time curl http://127.0.0.1:5000/slow
Output
{"message":"This request was slow!"}
curl http://127.0.0.1:5000/slow 0.00s user 0.01s system 0% cpu 2.023 total
Subsequent Request to the /slow Endpoint within 60 Seconds
root@ubuntu:~# time curl http://127.0.0.1:5000/slow
Output
{"message":"This request was slow!"}
curl http://127.0.0.1:5000/slow 0.00s user 0.00s system 0% cpu 0.015 total
Optimize Database Queries
Database queries can often become a performance bottleneck. In this section, you’ll simulate database query optimization using SQLAlchemy and connection pooling.
Simulate a Database Query with Connection Pooling
First, let’s install SQLAlchemy:
root@ubuntu:~# pip install Flask-SQLAlchemy
Update app.py to Configure Connection Pooling
app.py
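from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import text

# Configure the database connection and its pool
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///test.db'
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
# Flask-SQLAlchemy 3.x reads pool settings from SQLALCHEMY_ENGINE_OPTIONS;
# the older SQLALCHEMY_POOL_SIZE key was removed in that release.
# This assumes SQLAlchemy 2.x, where file-based SQLite uses a QueuePool.
app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {'pool_size': 5}  # Connection pool size
db = SQLAlchemy(app)

@app.route('/db1')
def db_operation_pooling():
    # Run a trivial query to exercise a pooled connection
    result = db.session.execute(text('SELECT 1')).fetchall()
    return jsonify(result=str(result))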
Run a Curl Request to the /db1 Route
root@ubuntu:~# curl http://127.0.0.1:5000/db1
Output
{"result":"[(1,)]"}
Enable Gzip Compression
Compressing your responses can drastically reduce the amount of data transferred between your server and clients, improving performance.
Install and Configure Flask-Compress
Let’s install the Flask-Compress package:
root@ubuntu:~# pip install Flask-Compress
Update app.py to Enable Compression
app.py
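from flask_compress import Compress

# Enable compression for the Flask app.
# Responses are compressed before being sent to clients,
# reducing data transfer and improving performance.
# Note: Flask-Compress skips responses smaller than COMPRESS_MIN_SIZE
# (500 bytes by default), so a body this short is sent uncompressed.
Compress(app)

@app.route('/compress')
def compressed_page():  # Named so it does not shadow the Compress class
    return "<h1>Welcome to the optimized Flask app!</h1>"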
Offload Intensive Tasks to Celery
For resource-heavy operations like sending emails or processing large datasets, it’s best to offload them to background tasks using Celery. This prevents long-running tasks from blocking incoming requests.
Why Use Celery?
- Improved response times: User requests complete faster by delegating heavy tasks to workers.
- Better scalability: Celery can distribute tasks across multiple machines.
- Handles complex tasks: Useful for large computations and long-running background tasks.
- Task scheduling: Built-in support for periodic tasks and retries.
- Works with message brokers: Integrates with Redis, RabbitMQ, and others for asynchronous processing.
By leveraging Celery, you can ensure that your Flask application remains responsive even when dealing with computationally intensive or I/O-bound tasks.
Set Up Celery for Background Tasks
Let’s install Celery:
root@ubuntu:~# pip install Celery
Update app.py to Configure Celery for Asynchronous Tasks
app.py
import time

from celery import Celery

celery = Celery(app.name, broker='redis://localhost:6379/0')

@celery.task
def long_task():
    time.sleep(10)  # Simulate a long task
    return "Task Complete"

@app.route('/start-task')
def start_task():
    long_task.delay()  # Queue the task without waiting for it to finish
    return 'Task started'
In a separate terminal, start the Celery worker:
root@ubuntu:~# celery -A app.celery worker --loglevel=info
Output
------------- celery@your-computer-name v5.2.7 (dawn-chorus)
--- ***** -----
-- ******* ---- Linux-x.x.x-x-generic-x86_64-with-glibc2.xx 2023-xx-xx
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: app:0x7f8b8c0b3cd0
- ** ---------- .> transport: redis://localhost:6379/0
- ** ---------- .> results: disabled://
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. app.long_task
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] Connected to redis://localhost:6379/0
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] mingle: searching for neighbors
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] mingle: all alone
[2023-xx-xx xx:xx:xx,xxx: INFO/MainProcess] celery@your-computer-name ready.
Run a Curl Command to Trigger the Task
root@ubuntu:~# curl http://127.0.0.1:5000/start-task
Output
Task started
Understanding the start_task() Function
- It calls long_task.delay(), which asynchronously starts the Celery task. This means the task is queued to run in the background, but the function doesn’t wait for it to complete.
- It immediately returns the string 'Task started'.
The important thing to note is that the actual long-running task (simulated by the 10-second sleep) is executed asynchronously by Celery. The Flask route doesn’t wait for this task to complete before responding to the request.
After about 10 seconds, when the task completes, the Celery worker’s log will show output similar to this:
[2024-xx-xx xx:xx:xx,xxx: INFO/MainProcess] Task app.long_task[task-id] received
[2024-xx-xx xx:xx:xx,xxx: INFO/ForkPoolWorker-1] Task app.long_task[task-id] succeeded in 10.xxxs: 'Task Complete'
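The "respond now, finish later" pattern can be seen in miniature without a broker. In the sketch below, a standard-library thread pool stands in for the Celery worker, and the sleep is shortened to 0.5 seconds. This is an analogy only: unlike Celery, queued work here lives inside the web process and is lost if that process exits.

```python
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)  # stand-in for Celery workers


def long_task():
    time.sleep(0.5)  # Simulate a long task (shortened for the demo)
    return "Task Complete"


def start_task():
    # submit() queues the work and returns a Future immediately,
    # much like long_task.delay() returns without waiting.
    future = executor.submit(long_task)
    return "Task started", future


started = time.perf_counter()
response, future = start_task()
response_time = time.perf_counter() - started   # time until the "response"

result = future.result()                        # wait for the background work
total_time = time.perf_counter() - started
executor.shutdown()

print(f"{response!r} after {response_time:.3f}s; "
      f"{result!r} after {total_time:.3f}s")
```

The response is available almost immediately, while the result arrives only after the simulated work finishes. This is the same separation the Flask route and the Celery worker provide.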
Celery in Production
In a production environment, implementing Celery involves:
- Using a robust message broker like RabbitMQ.
- Employing a dedicated result backend (e.g., PostgreSQL).
- Managing workers with process control systems (e.g., Supervisor).
- Implementing monitoring tools (e.g., Flower).
- Enhancing error handling and logging.
- Utilizing task prioritization.
- Scaling with multiple workers across different machines.
- Ensuring proper security measures.
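Several of these points map onto Celery configuration settings. The sketch below shows what a configuration module might contain. The file name celeryconfig.py, the credentials, the database name, and the time limits are all placeholders chosen for illustration, and the PostgreSQL result backend additionally assumes SQLAlchemy and a PostgreSQL driver are installed.

```python
# celeryconfig.py -- hypothetical module name; every value is a placeholder

# RabbitMQ as the message broker (default local credentials shown)
broker_url = "amqp://guest:guest@localhost:5672//"

# Store task results in PostgreSQL through Celery's database backend
result_backend = "db+postgresql://celery_user:change-me@localhost/celery_results"

# Acknowledge a task only after it finishes, so a crashed worker's task is redelivered
task_acks_late = True

# Let each worker process reserve one task at a time
worker_prefetch_multiplier = 1

# Raise an exception inside the task at 240s, then kill it outright at 300s
task_soft_time_limit = 240
task_time_limit = 300

print("broker:", broker_url)
```

A module like this could be loaded with celery.config_from_object("celeryconfig") in place of passing broker= to the Celery constructor.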
Conclusion
In this tutorial, you learned how to optimize a Flask application by implementing various performance-enhancing techniques. By following these steps, you can improve the performance, scalability, and responsiveness of your Flask application, ensuring it runs efficiently even under heavy load.