Deploying Datasette
The quickest way to deploy a Datasette instance on the internet is to use the datasette publish command, described in Publishing data. This can be used to quickly deploy Datasette to a number of hosting providers including Heroku, Google Cloud Run and Vercel.
You can deploy Datasette to other hosting providers using the instructions on this page.
Deployment fundamentals
Datasette can be deployed as a single datasette process that listens on a port. Datasette is not designed to be run as root, so that process should listen on an unprivileged port (1024 or above), such as port 8000.
If you want to serve Datasette on port 80 (the HTTP default port) or port 443 (for HTTPS) you should run it behind a proxy server, such as nginx, Apache or HAProxy. The proxy server can listen on port 80/443 and forward traffic on to Datasette.
Running Datasette using systemd
You can run Datasette on Ubuntu or Debian systems using systemd.
First, ensure you have Python 3 and pip installed. On Ubuntu you can use sudo apt-get install python3 python3-pip.
You can install Datasette into a virtual environment, or you can install it system-wide. To install system-wide, use sudo pip3 install datasette.
Now create a folder for your Datasette databases, for example using mkdir /home/ubuntu/datasette-root.
You can copy a test database into that folder like so:
cd /home/ubuntu/datasette-root
curl -O https://latest.datasette.io/fixtures.db
Create a file at /etc/systemd/system/datasette.service with the following contents:
[Unit]
Description=Datasette
After=network.target
[Service]
Type=simple
User=ubuntu
Environment=DATASETTE_SECRET=
WorkingDirectory=/home/ubuntu/datasette-root
ExecStart=datasette serve . -h 127.0.0.1 -p 8000
Restart=on-failure
[Install]
WantedBy=multi-user.target
Add a random value for DATASETTE_SECRET. This will be used to sign Datasette cookies such as the CSRF token cookie. You can generate a suitable value like so:
python3 -c 'import secrets; print(secrets.token_hex(32))'
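As a minimal sketch, the unit file and its secret could be produced together by a short Python script. The render_unit helper, its parameters and the UNIT_TEMPLATE name are hypothetical, introduced only for this illustration; the unit contents mirror the example above.

```python
import secrets

# Hypothetical template mirroring the unit file shown above.
UNIT_TEMPLATE = """\
[Unit]
Description=Datasette
After=network.target

[Service]
Type=simple
User={user}
Environment=DATASETTE_SECRET={secret}
WorkingDirectory={root}
ExecStart=datasette serve . -h 127.0.0.1 -p {port}
Restart=on-failure

[Install]
WantedBy=multi-user.target
"""


def render_unit(user="ubuntu", root="/home/ubuntu/datasette-root", port=8000):
    """Return unit file text with a freshly generated 64-character hex secret."""
    secret = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters
    return UNIT_TEMPLATE.format(user=user, secret=secret, root=root, port=port)


print(render_unit())
```

The printed text could then be saved to /etc/systemd/system/datasette.service. Each call generates a new secret, so re-rendering the file and restarting the service would invalidate previously signed cookies.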
This systemd configuration will run Datasette against all database files contained in the /home/ubuntu/datasette-root directory. If that directory contains a metadata.yml (or .json) file or a templates/ or plugins/ sub-directory, those will automatically be loaded by Datasette. See Configuration directory mode for details.
You can start the Datasette process running using the following:
sudo systemctl daemon-reload
sudo systemctl start datasette.service
You will need to restart the Datasette service after making changes to its metadata.json configuration or adding a new database file to that directory. You can do that using:
sudo systemctl restart datasette.service
Once the service has started you can confirm that Datasette is running on port 8000 like so:
curl 127.0.0.1:8000/-/versions.json
# Should output JSON showing the installed version
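The same check could be scripted, for example as part of a deployment smoke test. This is a sketch that assumes the /-/versions.json payload contains a "datasette" object with a "version" key; the function names are hypothetical and the sample payload uses placeholder values, not real output.

```python
import json
import urllib.request


def datasette_version(versions_json):
    """Extract the Datasette version from a /-/versions.json response body.

    Assumes the payload has the shape {"datasette": {"version": "..."}}.
    """
    data = json.loads(versions_json)
    return data["datasette"]["version"]


def fetch_version(base_url="http://127.0.0.1:8000", timeout=5):
    """Fetch /-/versions.json from a running instance and return its version."""
    url = base_url + "/-/versions.json"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return datasette_version(response.read())


# Parsing a sample payload (placeholder values only, not real output)
sample = '{"datasette": {"version": "0.0"}, "python": {"version": "3.0"}}'
print(datasette_version(sample))
```

With the service running, calling fetch_version() should return the installed version string; if nothing is listening on the port, urllib raises an error.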
Datasette will not be accessible from outside the server because it is listening on 127.0.0.1. You can expose it by instead listening on 0.0.0.0, but a better way is to set up a proxy such as nginx. See Running Datasette behind a proxy.
Running Datasette using OpenRC
OpenRC is the service manager on non-systemd Linux distributions like Alpine Linux and Gentoo.
Create an init script at /etc/init.d/datasette with the following contents:
#!/sbin/openrc-run
name="datasette"
command="datasette"
command_args="serve -h 0.0.0.0 /path/to/db.db"
command_background=true
pidfile="/run/${RC_SVCNAME}.pid"
You then need to configure the service to run at boot and start it:
rc-update add datasette
rc-service datasette start
Deploying using buildpacks
Some hosting providers such as Heroku, DigitalOcean App Platform and Scalingo support the Buildpacks standard for deploying Python web applications.
Deploying Datasette on these platforms requires two files: requirements.txt and Procfile.
The requirements.txt file lets the platform know which Python packages should be installed. It should contain datasette at a minimum, but can also list any Datasette plugins you wish to install, for example:
datasette
datasette-vega
The Procfile lets the hosting platform know how to run the command that serves web traffic. It should look like this:
web: datasette . -h 0.0.0.0 -p $PORT --cors
The $PORT environment variable is provided by the hosting platform. The --cors option enables CORS requests from JavaScript running on other websites to your domain; omit it if you don't want to allow CORS. You can add additional Datasette Settings options here too.
These two files should be enough to deploy Datasette on any host that supports buildpacks. Datasette will serve any SQLite files that are included in the root directory of the application.
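The two files described above could also be generated by a script. The following is a sketch only: write_buildpack_files and its parameters are hypothetical names chosen for this example, and the file contents follow the examples in this section.

```python
from pathlib import Path


def write_buildpack_files(app_dir, plugins=(), cors=True):
    """Write requirements.txt and Procfile into app_dir and return both paths."""
    app_dir = Path(app_dir)
    app_dir.mkdir(parents=True, exist_ok=True)

    # datasette first, then any plugins, one package per line
    requirements = app_dir / "requirements.txt"
    requirements.write_text("\n".join(["datasette", *plugins]) + "\n")

    # $PORT is left unexpanded; the hosting platform supplies it at runtime
    command = "web: datasette . -h 0.0.0.0 -p $PORT"
    if cors:
        command += " --cors"
    procfile = app_dir / "Procfile"
    procfile.write_text(command + "\n")

    return requirements, procfile
```

For example, write_buildpack_files("my-app", plugins=["datasette-vega"]) would create both files in a my-app directory (a placeholder name), ready to sit alongside the SQLite files to be served.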
If you want to build SQLite files or download them as part of the deployment process you can do so using a bin/post_compile file. For example, the following bin/post_compile will download an example database that will then be served by Datasette:
wget https://fivethirtyeight.datasettes.com/fivethirtyeight.db
simonw/buildpack-datasette-demo is an example GitHub repository showing a Datasette configuration that can be deployed to a buildpack-supporting host.
Running Datasette behind a proxy
You may wish to run Datasette behind an Apache or nginx proxy, using a path within your existing site.
You can use the base_url configuration setting to tell Datasette to serve traffic with a specific URL prefix. For example, you could run Datasette like this:
datasette my-database.db --setting base_url /my-datasette/ -p 8009
This will run Datasette with the following URLs:
http://127.0.0.1:8009/my-datasette/ - the Datasette homepage
http://127.0.0.1:8009/my-datasette/my-database - the page for the my-database.db database
http://127.0.0.1:8009/my-datasette/my-database/some_table - the page for the some_table table
You can now set your nginx or Apache server to proxy the /my-datasette/ path to this Datasette instance.
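To make the effect of the prefix concrete, here is a small sketch of how a path maps onto a URL under base_url. The prefixed_url helper is hypothetical and written for this illustration; it is not a Datasette API.

```python
def prefixed_url(base_url, path="", host="http://127.0.0.1:8009"):
    """Join host, base_url prefix and a Datasette path into a full URL."""
    # Normalise so the prefix always starts and ends with a single slash
    stripped = base_url.strip("/")
    prefix = "/" + stripped + "/" if stripped else "/"
    return host + prefix + path.lstrip("/")


# Reproduce the three example URLs for the /my-datasette/ prefix
for path in ("", "my-database", "my-database/some_table"):
    print(prefixed_url("/my-datasette/", path))
```

Every page moves under the prefix, which is why the proxy has to forward the full /my-datasette/ path rather than stripping it.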
Nginx proxy configuration
Here is an example of an nginx configuration file that will proxy traffic to Datasette:
daemon off;
events {
    worker_connections 1024;
}
http {
    server {
        listen 80;
        location /my-datasette {
            proxy_pass http://127.0.0.1:8009/my-datasette;
            proxy_set_header Host $host;
        }
    }
}
You can also use the --uds option to Datasette to listen on a Unix domain socket instead of a port, configuring the nginx upstream proxy like this:
daemon off;
events {
    worker_connections 1024;
}
http {
    server {
        listen 80;
        location /my-datasette {
            proxy_pass http://datasette/my-datasette;
            proxy_set_header Host $host;
        }
    }
    upstream datasette {
        server unix:/tmp/datasette.sock;
    }
}
Then run Datasette with:
datasette --uds /tmp/datasette.sock path/to/database.db --setting base_url /my-datasette/
Apache proxy configuration
For Apache, you can use the ProxyPass directive. First make sure the following lines are uncommented:
LoadModule proxy_module lib/httpd/modules/mod_proxy.so
LoadModule proxy_http_module lib/httpd/modules/mod_proxy_http.so
Then add these directives to proxy traffic:
ProxyPass /my-datasette/ http://127.0.0.1:8009/my-datasette/
ProxyPreserveHost On
A live demo of Datasette running behind Apache using this proxy setup can be seen at datasette-apache-proxy-demo.datasette.io/prefix/. The code for that demo can be found in the demos/apache-proxy directory.
If Datasette is running with --uds, you can proxy to its Unix domain socket, similar to the nginx example:
ProxyPass /my-datasette/ unix:/tmp/datasette.sock|http://localhost/my-datasette/
The ProxyPreserveHost On directive ensures that the original Host: header from the incoming request is passed through to Datasette. Datasette needs this to correctly assemble links to other pages using the .absolute_url(request, path) method.
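The following sketch illustrates why the Host: header matters when absolute links are built. It is a simplified illustration, not Datasette's actual implementation: the standalone absolute_url function here is hypothetical, and example.com is a placeholder domain.

```python
def absolute_url(host_header, path, scheme="http"):
    """Build an absolute URL from the request's Host header and a path.

    Simplified illustration only; not Datasette's actual implementation.
    """
    return f"{scheme}://{host_header}{path}"


path = "/my-datasette/my-database"

# With ProxyPreserveHost On, the backend sees the Host the visitor requested
print(absolute_url("example.com", path))

# Without it, the backend sees the host from the ProxyPass target instead,
# so generated links would point at an address visitors cannot reach
print(absolute_url("127.0.0.1:8009", path))
```

The nginx examples achieve the same result with proxy_set_header Host $host.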