Publishing data¶
Datasette includes tools for publishing and deploying your data to the internet. The datasette publish command will deploy a new Datasette instance containing your databases directly to a Heroku, Google Cloud or Zeit Now hosting account. You can also use datasette package to create a Docker image that bundles your databases together with the datasette application that is used to serve them.
datasette publish¶
Once you have created a SQLite database (e.g. using csvs-to-sqlite) you can deploy it to a hosting account using a single command.
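If you do not yet have a database to publish, the following is a minimal sketch that builds one with Python's standard sqlite3 module instead of csvs-to-sqlite. The file name mydatabase.db matches the examples on this page; the plants table, its rows and the create_database helper are made up purely for illustration.

```python
import sqlite3


def create_database(path):
    """Create (or update) a tiny SQLite database at the given path."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS plants ("
                "id INTEGER PRIMARY KEY, name TEXT NOT NULL, height_cm REAL)"
            )
            # INSERT OR REPLACE keeps the script safe to run more than once
            conn.executemany(
                "INSERT OR REPLACE INTO plants (id, name, height_cm) VALUES (?, ?, ?)",
                [(1, "Fern", 45.0), (2, "Cactus", 12.5), (3, "Bamboo", 300.0)],
            )
    finally:
        conn.close()


if __name__ == "__main__":
    create_database("mydatabase.db")
```

The resulting file can be passed to datasette publish in the same way as any other SQLite database.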
You will need a hosting account with Heroku or Google Cloud. Once you have created your account you will need to install and configure the heroku or gcloud command-line tools.
Publishing to Heroku¶
To publish your data using Heroku, first create an account there and install and configure the Heroku CLI tool.
You can publish a database to Heroku using the following command:
datasette publish heroku mydatabase.db
This will output some details about the new deployment, including a URL like this one:
https://limitless-reef-88278.herokuapp.com/ deployed to Heroku
You can specify a custom app name by passing -n my-app-name to the publish command. This will also allow you to overwrite an existing app.
$ datasette publish heroku --help
Usage: datasette publish heroku [OPTIONS] [FILES]...
Options:
  -m, --metadata FILENAME   Path to JSON file containing metadata to publish
  --extra-options TEXT      Extra options to pass to datasette serve
  --branch TEXT             Install datasette from a GitHub branch e.g. master
  --template-dir DIRECTORY  Path to directory containing custom templates
  --plugins-dir DIRECTORY   Path to directory containing custom plugins
  --static MOUNT:DIRECTORY  Serve static files from this directory at /MOUNT/...
  --install TEXT            Additional packages (e.g. plugins) to install
  --plugin-secret <TEXT TEXT TEXT>...
                            Secrets to pass to plugins, e.g. --plugin-secret
                            datasette-auth-github client_id xxx
  --version-note TEXT       Additional note to show on /-/versions
  --title TEXT              Title for metadata
  --license TEXT            License label for metadata
  --license_url TEXT        License URL for metadata
  --source TEXT             Source label for metadata
  --source_url TEXT         Source URL for metadata
  --about TEXT              About label for metadata
  --about_url TEXT          About URL for metadata
  -n, --name TEXT           Application name to use when deploying
  --help                    Show this message and exit.
Publishing to Google Cloud Run¶
Google Cloud Run became generally available in November 2019. It lets you publish data in a scale-to-zero environment: your application starts running when the first request is received and shuts down again when traffic ceases, so you only pay for time spent serving traffic.
You will first need to install and configure the Google Cloud CLI tools by following these instructions.
You can then publish a database to Google Cloud Run using the following command:
datasette publish cloudrun mydatabase.db --service=my-database
A Cloud Run service is a single hosted application. The service name you specify will be used as part of the Cloud Run URL. If you deploy to a service name that you have used in the past your new deployment will replace the previous one.
If you omit the --service option you will be asked to pick a service name interactively during the deploy.
You may need to interact with prompts from the tool. Once it has finished it will output a URL like this one:
Service [my-service] revision [my-service-00001] has been deployed
and is serving traffic at https://my-service-j7hipcg4aq-uc.a.run.app
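When a deployment is scripted, it can be useful to capture that URL rather than read it off the terminal. The following is a minimal sketch, assuming the deploy output has already been captured as text and contains the URL in the form shown above; extract_url is a hypothetical helper and not part of Datasette.

```python
import re

# Matches the first https:// URL, stopping at whitespace or a closing bracket
URL_PATTERN = re.compile(r"https://[^\s\])]+")


def extract_url(output):
    """Return the first https URL found in the publish output, or None."""
    match = URL_PATTERN.search(output)
    return match.group(0) if match else None


sample = (
    "Service [my-service] revision [my-service-00001] has been deployed\n"
    "and is serving traffic at https://my-service-j7hipcg4aq-uc.a.run.app\n"
)
print(extract_url(sample))  # https://my-service-j7hipcg4aq-uc.a.run.app
```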
$ datasette publish cloudrun --help
Usage: datasette publish cloudrun [OPTIONS] [FILES]...
Options:
  -m, --metadata FILENAME   Path to JSON file containing metadata to publish
  --extra-options TEXT      Extra options to pass to datasette serve
  --branch TEXT             Install datasette from a GitHub branch e.g. master
  --template-dir DIRECTORY  Path to directory containing custom templates
  --plugins-dir DIRECTORY   Path to directory containing custom plugins
  --static MOUNT:DIRECTORY  Serve static files from this directory at /MOUNT/...
  --install TEXT            Additional packages (e.g. plugins) to install
  --plugin-secret <TEXT TEXT TEXT>...
                            Secrets to pass to plugins, e.g. --plugin-secret
                            datasette-auth-github client_id xxx
  --version-note TEXT       Additional note to show on /-/versions
  --title TEXT              Title for metadata
  --license TEXT            License label for metadata
  --license_url TEXT        License URL for metadata
  --source TEXT             Source label for metadata
  --source_url TEXT         Source URL for metadata
  --about TEXT              About label for metadata
  --about_url TEXT          About URL for metadata
  -n, --name TEXT           Application name to use when building
  --service TEXT            Cloud Run service to deploy (or over-write)
  --spatialite              Enable SpatialLite extension
  --show-files              Output the generated Dockerfile and metadata.json
  --memory TEXT             Memory to allocate in Cloud Run, e.g. 1Gi
  --help                    Show this message and exit.
Publishing to Zeit Now v1¶
Datasette can be deployed to Zeit Now’s older v1 hosting platform. Zeit no longer accepts new signups for this service, so this option is only available if you created an account before January 2019.
To publish your database(s) to a new instance hosted by Zeit Now v1, install the now CLI tool and then run the following command:
datasette publish nowv1 mydatabase.db
This will upload your database to Zeit Now, assign you a new URL and install and start a new instance of Datasette to serve your database.
The command will output a URL that looks something like this:
https://datasette-elkksjmyfj.now.sh
You can navigate to this URL to see live logs of the deployment process. Your new Datasette instance will be available at that URL.
Once the deployment has completed, you can assign a custom URL to your instance using the now alias command:
now alias https://datasette-elkksjmyfj.now.sh datasette-publish-demo.now.sh
You can use anything-you-like.now.sh, provided no one else has already registered that alias.
You can also use custom domains, if you first register them with Zeit Now.
$ datasette publish nowv1 --help
Usage: datasette publish nowv1 [OPTIONS] [FILES]...
Options:
  -m, --metadata FILENAME   Path to JSON file containing metadata to publish
  --extra-options TEXT      Extra options to pass to datasette serve
  --branch TEXT             Install datasette from a GitHub branch e.g. master
  --template-dir DIRECTORY  Path to directory containing custom templates
  --plugins-dir DIRECTORY   Path to directory containing custom plugins
  --static MOUNT:DIRECTORY  Serve static files from this directory at /MOUNT/...
  --install TEXT            Additional packages (e.g. plugins) to install
  --plugin-secret <TEXT TEXT TEXT>...
                            Secrets to pass to plugins, e.g. --plugin-secret
                            datasette-auth-github client_id xxx
  --version-note TEXT       Additional note to show on /-/versions
  --title TEXT              Title for metadata
  --license TEXT            License label for metadata
  --license_url TEXT        License URL for metadata
  --source TEXT             Source label for metadata
  --source_url TEXT         Source URL for metadata
  --about TEXT              About label for metadata
  --about_url TEXT          About URL for metadata
  -n, --name TEXT           Application name to use when deploying
  --force                   Pass --force option to now
  --token TEXT              Auth token to use for deploy
  --alias TEXT              Desired alias e.g. yoursite.now.sh
  --spatialite              Enable SpatialLite extension
  --show-files              Output the generated Dockerfile and metadata.json
  --help                    Show this message and exit.
Custom metadata and plugins¶
datasette publish accepts a number of additional options which can be used to further customize your Datasette instance.
You can define your own Metadata and deploy that with your instance like so:
datasette publish cloudrun --service=my-service mydatabase.db -m metadata.json
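The metadata file is plain JSON, so it can be generated by a script. The sketch below writes a small metadata.json; it assumes the key names mirror the metadata options listed in the help output on this page (title, source, source_url, license, license_url), and write_metadata is a hypothetical helper rather than anything Datasette provides.

```python
import json


def write_metadata(path, title, source, source_url, license=None, license_url=None):
    """Write a small Datasette metadata file and return the dict that was written."""
    metadata = {"title": title, "source": source, "source_url": source_url}
    # Only include the license keys when values were supplied
    if license:
        metadata["license"] = license
    if license_url:
        metadata["license_url"] = license_url
    with open(path, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2)
    return metadata


if __name__ == "__main__":
    write_metadata(
        "metadata.json",
        title="Title of my database",
        source="Where the data originated",
        source_url="http://www.example.com/",
    )
```

The generated file can then be supplied to the publish command with -m metadata.json.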
If you just want to set the title, license or source information you can do that directly using extra options to datasette publish:
datasette publish cloudrun mydatabase.db --service=my-service \
    --title="Title of my database" \
    --source="Where the data originated" \
    --source_url="http://www.example.com/"
You can also specify plugins you would like to install. For example, if you want to include the datasette-vega visualization plugin you can use the following:
datasette publish cloudrun mydatabase.db --service=my-service --install=datasette-vega
If a plugin has any secret configuration values you can use the --plugin-secret option to set those secrets at publish time. For example, using Heroku with datasette-auth-github you might run the following command:
$ datasette publish heroku my_database.db \
    --name my-heroku-app-demo \
    --install=datasette-auth-github \
    --plugin-secret datasette-auth-github client_id your_client_id \
    --plugin-secret datasette-auth-github client_secret your_client_secret
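To avoid typing secret values directly into a shell, the same command can be assembled by a script that reads them from the environment. This is only a sketch: build_publish_command is a hypothetical helper, and GITHUB_CLIENT_ID and GITHUB_CLIENT_SECRET are assumed variable names, not anything Datasette or the plugin defines.

```python
import os
import shlex


def build_publish_command(database, app_name, plugin, secrets):
    """Return the argument list for publishing to Heroku with plugin secrets."""
    command = [
        "datasette", "publish", "heroku", database,
        "--name", app_name,
        f"--install={plugin}",
    ]
    for key, value in secrets.items():
        # Each secret is passed as: --plugin-secret <plugin> <key> <value>
        command += ["--plugin-secret", plugin, key, value]
    return command


if __name__ == "__main__":
    # GITHUB_CLIENT_ID and GITHUB_CLIENT_SECRET are hypothetical variable names
    secrets = {
        "client_id": os.environ.get("GITHUB_CLIENT_ID", "your_client_id"),
        "client_secret": os.environ.get("GITHUB_CLIENT_SECRET", "your_client_secret"),
    }
    command = build_publish_command(
        "my_database.db", "my-heroku-app-demo", "datasette-auth-github", secrets
    )
    # Show the command with the secret values masked; to deploy, pass the
    # unmasked list to subprocess.run(command, check=True)
    masked = build_publish_command(
        "my_database.db", "my-heroku-app-demo", "datasette-auth-github",
        {key: "***" for key in secrets},
    )
    print(shlex.join(masked))
```

Building the command as a list rather than a single string means no shell quoting is needed when it is eventually executed.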
datasette package¶
If you have Docker installed (e.g. using Docker for Mac) you can use the datasette package command to create a new Docker image in your local repository containing the datasette app bundled together with your selected SQLite databases:
datasette package mydatabase.db
Here’s example output for the package command:
$ datasette package parlgov.db --extra-options="--config sql_time_limit_ms:2500"
Sending build context to Docker daemon 4.459MB
Step 1/7 : FROM python:3
---> 79e1dc9af1c1
Step 2/7 : COPY . /app
---> Using cache
---> cd4ec67de656
Step 3/7 : WORKDIR /app
---> Using cache
---> 139699e91621
Step 4/7 : RUN pip install datasette
---> Using cache
---> 340efa82bfd7
Step 5/7 : RUN datasette inspect parlgov.db --inspect-file inspect-data.json
---> Using cache
---> 5fddbe990314
Step 6/7 : EXPOSE 8001
---> Using cache
---> 8e83844b0fed
Step 7/7 : CMD datasette serve parlgov.db --port 8001 --inspect-file inspect-data.json --config sql_time_limit_ms:2500
---> Using cache
---> 1bd380ea8af3
Successfully built 1bd380ea8af3
You can now run the resulting container like so:
docker run -p 8081:8001 1bd380ea8af3
This exposes port 8001 inside the container as port 8081 on your host machine, so you can access the application at http://localhost:8081/
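To confirm from a script that the container is answering on the mapped host port, a simple HTTP check is enough. The sketch below uses only the Python standard library; is_serving is a hypothetical helper, and the URL assumes the 8081 host port from the docker run example.

```python
import urllib.error
import urllib.request


def is_serving(url, timeout=5):
    """Return True if an HTTP GET to url succeeds with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status
        return False


if __name__ == "__main__":
    # Port 8081 is the host port used in the docker run example
    print(is_serving("http://localhost:8081/"))
```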
You can customize the port that is exposed by the container using the --port option:
datasette package mydatabase.db --port 8080
A full list of options can be seen by running datasette package --help:
$ datasette package --help
Usage: datasette package [OPTIONS] FILES...
  Package specified SQLite files into a new datasette Docker container
Options:
  -t, --tag TEXT            Name for the resulting Docker container, can optionally use
                            name:tag format
  -m, --metadata FILENAME   Path to JSON file containing metadata to publish
  --extra-options TEXT      Extra options to pass to datasette serve
  --branch TEXT             Install datasette from a GitHub branch e.g. master
  --template-dir DIRECTORY  Path to directory containing custom templates
  --plugins-dir DIRECTORY   Path to directory containing custom plugins
  --static MOUNT:DIRECTORY  Serve static files from this directory at /MOUNT/...
  --install TEXT            Additional packages (e.g. plugins) to install
  --spatialite              Enable SpatialLite extension
  --version-note TEXT       Additional note to show on /-/versions
  -p, --port INTEGER        Port to run the server on, defaults to 8001
  --title TEXT              Title for metadata
  --license TEXT            License label for metadata
  --license_url TEXT        License URL for metadata
  --source TEXT             Source label for metadata
  --source_url TEXT         Source URL for metadata
  --about TEXT              About label for metadata
  --about_url TEXT          About URL for metadata
  --help                    Show this message and exit.