--- /dev/null
-"Office" documents (such as ".doc", ".xlsx" and ".odt"). If you
-wish to use this, you must provide a Tika server and a Gotenberg server,
+ # Configuration
+
+ Paperless provides a wide range of customizations. Depending on how you
+ run paperless, these settings have to be defined in different places.
+
+ - If you run paperless on docker, `paperless.conf` is not used.
+ Rather, configure paperless by copying necessary options to
+ `docker-compose.env`.
+
+ - If you are running paperless on anything else, paperless will search
+ for the configuration file in these locations and use the first one
+ it finds:
+
+ ```
+ /path/to/paperless/paperless.conf
+ /etc/paperless.conf
+ /usr/local/etc/paperless.conf
+ ```
+
+ ## Required services
+
+ `PAPERLESS_REDIS=<url>`
+
+ : This is required for processing scheduled tasks such as email
+ fetching, index optimization and for training the automatic document
+ matcher.
+
+ - If your Redis server needs login credentials, use
+ `PAPERLESS_REDIS=redis://<username>:<password>@<host>:<port>`.
+ - If it only uses the `requirepass` option, use
+ `PAPERLESS_REDIS=redis://:<password>@<host>:<port>`.
+
+ [More information on securing your Redis
+ Instance](https://redis.io/docs/getting-started/#securing-redis).
+
+ Defaults to `redis://localhost:6379`.
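+
+ As an illustration, a password-protected Redis might be configured as
+ in the sketch below when running under Docker Compose. The service name
+ `broker` and the password `s3cret` are placeholders assumed for this
+ example, not values paperless ships with.
+
+ ```yaml
+ services:
+   webserver:
+     environment:
+       # requirepass-style login: empty username, password before the "@"
+       # "broker" and "s3cret" are hypothetical placeholders
+       PAPERLESS_REDIS: "redis://:s3cret@broker:6379"
+ ```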
+
+ `PAPERLESS_DBENGINE=<engine_name>`
+
+ : Optional. Selects PostgreSQL or MariaDB as the database engine.
+ Available options are `postgresql` and `mariadb`.
+
+ Default is `postgresql`.
+
+ !!! warning
+
+ Using MariaDB comes with some caveats. See [MySQL Caveats](advanced_usage#mysql-caveats).
+
+ `PAPERLESS_DBHOST=<hostname>`
+
+ : By default, sqlite is used as the database backend. This can be
+ changed here.
+
+ Set PAPERLESS_DBHOST and another database will be used instead of
+ sqlite.
+
+ `PAPERLESS_DBPORT=<port>`
+
+ : Adjust port if necessary.
+
+ Default is 5432.
+
+ `PAPERLESS_DBNAME=<name>`
+
+ : Database name in PostgreSQL or MariaDB.
+
+ Defaults to "paperless".
+
+ `PAPERLESS_DBUSER=<name>`
+
+ : Database user in PostgreSQL or MariaDB.
+
+ Defaults to "paperless".
+
+ `PAPERLESS_DBPASS=<password>`
+
+ : Database password for PostgreSQL or MariaDB.
+
+ Defaults to "paperless".
+
+ `PAPERLESS_DBSSLMODE=<mode>`
+
+ : SSL mode to use when connecting to PostgreSQL.
+
+ See [the official documentation about
+ sslmode](https://www.postgresql.org/docs/current/libpq-ssl.html).
+
+ Default is `prefer`.
+
+ `PAPERLESS_DB_TIMEOUT=<float>`
+
+ : Amount of time for a database connection to wait for the database to
+ unlock. Mostly applicable to SQLite-based installations; consider
+ switching to PostgreSQL if you need to increase this.
+
+ Defaults to unset, keeping the Django defaults.
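+
+ Taken together, the database options above could look like the
+ following sketch for a PostgreSQL setup under Docker Compose. The host
+ name `db` and the password are assumptions made for illustration; the
+ remaining values are the documented defaults written out explicitly.
+
+ ```yaml
+ services:
+   webserver:
+     environment:
+       PAPERLESS_DBENGINE: postgresql
+       # Setting a host switches paperless away from SQLite
+       PAPERLESS_DBHOST: db              # hypothetical service name
+       PAPERLESS_DBPORT: 5432
+       PAPERLESS_DBNAME: paperless
+       PAPERLESS_DBUSER: paperless
+       PAPERLESS_DBPASS: "change-me"     # placeholder, pick your own
+       PAPERLESS_DBSSLMODE: prefer
+ ```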
+
+ ## Paths and folders
+
+ `PAPERLESS_CONSUMPTION_DIR=<path>`
+
+ : This is where your documents should go to be consumed. Make sure that
+ it exists and that the user running the paperless service can
+ read/write its contents before you start Paperless.
+
+ Don't change this when using docker, as it only changes the path
+ within the container. Change the local consumption directory in the
+ docker-compose.yml file instead.
+
+ Defaults to "../consume/", relative to the "src" directory.
+
+ `PAPERLESS_DATA_DIR=<path>`
+
+ : This is where paperless stores all its data (search index, SQLite
+ database, classification model, etc).
+
+ Defaults to "../data/", relative to the "src" directory.
+
+ `PAPERLESS_TRASH_DIR=<path>`
+
+ : Instead of removing deleted documents, they are moved to this
+ directory.
+
+ This must be writeable by the user running paperless. When running
+ inside docker, ensure that this path is within a permanent volume
+ (such as "../media/trash") so it won't get lost on upgrades.
+
+ Defaults to empty (i.e. really delete documents).
+
+ `PAPERLESS_MEDIA_ROOT=<path>`
+
+ : This is where your documents and thumbnails are stored.
+
+ You can set this and PAPERLESS_DATA_DIR to the same folder to have
+ paperless store all its data within the same volume.
+
+ Defaults to "../media/", relative to the "src" directory.
+
+ `PAPERLESS_STATICDIR=<path>`
+
+ : Override the default STATIC_ROOT here. This is where all static
+ files created using the "collectstatic" management command are stored.
+
+ Unless you're doing something fancy, there is no need to override
+ this.
+
+ Defaults to "../static/", relative to the "src" directory.
+
+ `PAPERLESS_FILENAME_FORMAT=<format>`
+
+ : Changes the filenames paperless uses to store documents in the media
+ directory. See [File name handling](advanced_usage#file_name_handling) for details.
+
+ Default is none, which disables this feature.
+
+ `PAPERLESS_FILENAME_FORMAT_REMOVE_NONE=<bool>`
+
+ : Tells paperless to omit placeholders in
+ `PAPERLESS_FILENAME_FORMAT` that would resolve to 'none' from the
+ resulting filename. This also holds true for directory names. See
+ [File name handling](advanced_usage#file_name_handling) for
+ details.
+
+ Defaults to `false`, which disables this feature.
+
+ `PAPERLESS_LOGGING_DIR=<path>`
+
+ : This is where paperless will store log files.
+
+ Defaults to "`PAPERLESS_DATA_DIR`/log/".
+
+ ## Logging
+
+ `PAPERLESS_LOGROTATE_MAX_SIZE=<num>`
+
+ : Maximum file size for log files before they are rotated, in bytes.
+
+ Defaults to 1 MiB.
+
+ `PAPERLESS_LOGROTATE_MAX_BACKUPS=<num>`
+
+ : Number of rotated log files to keep.
+
+ Defaults to 20.
+
+ ## Hosting & Security {#hosting-and-security}
+
+ `PAPERLESS_SECRET_KEY=<key>`
+
+ : Paperless uses this to make session tokens. If you expose paperless
+ on the internet, you need to change this, since the default secret
+ is well known.
+
+ Use any sequence of characters. The more, the better. You don't
+ need to remember this. Just face-roll your keyboard.
+
+ Default is listed in the file `src/paperless/settings.py`.
+
+ `PAPERLESS_URL=<url>`
+
+ : This setting can be used to set the three options below
+ (ALLOWED_HOSTS, CORS_ALLOWED_HOSTS and CSRF_TRUSTED_ORIGINS). If the
+ other options are set the values will be combined with this one. Do
+ not include a trailing slash. E.g. <https://paperless.domain.com>
+
+ Defaults to empty string, leaving the other settings unaffected.
+
+ `PAPERLESS_CSRF_TRUSTED_ORIGINS=<comma-separated-list>`
+
+ : A list of trusted origins for unsafe requests (e.g. POST). As of
+ Django 4.0 this is required to access the Django admin via the web.
+ See
+ <https://docs.djangoproject.com/en/4.0/ref/settings/#csrf-trusted-origins>
+
+ Can also be set using PAPERLESS_URL (see above).
+
+ Defaults to empty string, which does not add any origins to the
+ trusted list.
+
+ `PAPERLESS_ALLOWED_HOSTS=<comma-separated-list>`
+
+ : If you're planning on putting Paperless on the open internet, then
+ you really should set this value to the domain name you're using.
+ Failing to do so leaves you open to HTTP host header attacks:
+ <https://docs.djangoproject.com/en/3.1/topics/security/#host-header-validation>
+
+ Just remember that this is a comma-separated list, so
+ "example.com" is fine, as is "example.com,www.example.com", but
+ NOT " example.com" or "example.com,"
+
+ Can also be set using PAPERLESS_URL (see above).
+
+ If manually set, please remember to include "localhost". Otherwise
+ docker healthcheck will fail.
+
+ Defaults to "\*", which is all hosts.
+
+ `PAPERLESS_CORS_ALLOWED_HOSTS=<comma-separated-list>`
+
+ : You need to add your servers to the list of allowed hosts that can
+ do CORS calls. Set this to your public domain name.
+
+ Can also be set using PAPERLESS_URL (see above).
+
+ Defaults to "<http://localhost:8000>".
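+
+ The host-related settings above could be combined as in the sketch
+ below. The domain `paperless.example.com` is a placeholder; per the
+ descriptions above, `PAPERLESS_URL` is merged with any explicitly set
+ lists.
+
+ ```yaml
+ services:
+   webserver:
+     environment:
+       # No trailing slash; feeds the allowed hosts, CORS and CSRF settings
+       PAPERLESS_URL: "https://paperless.example.com"
+       # Comma-separated, no spaces; "localhost" keeps the docker
+       # healthcheck working when this list is set manually
+       PAPERLESS_ALLOWED_HOSTS: "paperless.example.com,localhost"
+ ```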
+
+ `PAPERLESS_FORCE_SCRIPT_NAME=<path>`
+
+ : To host paperless under a subpath url like example.com/paperless you
+ set this value to /paperless. No trailing slash!
+
+ Defaults to none, which hosts paperless at "/".
+
+ `PAPERLESS_STATIC_URL=<path>`
+
+ : Override the STATIC_URL here. Unless you're hosting Paperless under a
+ subpath like /paperless/, you probably don't need to change this.
+ If you do change it, be sure to include the trailing slash.
+
+ Defaults to "/static/".
+
+ !!! note
+
+ When hosting paperless behind a reverse proxy like Traefik or Nginx
+ at a subpath e.g. example.com/paperlessngx you will also need to set
+ `PAPERLESS_FORCE_SCRIPT_NAME` (see above).
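+
+ A minimal sketch for serving paperless at `example.com/paperless`
+ behind a reverse proxy might look like this. The static URL shown is an
+ assumption derived from the subpath, and the proxy configuration itself
+ is not covered here.
+
+ ```yaml
+ services:
+   webserver:
+     environment:
+       # Subpath without a trailing slash
+       PAPERLESS_FORCE_SCRIPT_NAME: /paperless
+       # Assumed value: static files served below the same subpath,
+       # with the trailing slash included
+       PAPERLESS_STATIC_URL: /paperless/static/
+ ```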
+
+ `PAPERLESS_AUTO_LOGIN_USERNAME=<username>`
+
+ : Specify a username here so that paperless will automatically perform
+ login with the selected user.
+
+ !!! danger
+
+ Do not use this when exposing paperless on the internet. There are
+ no checks in place that would prevent you from doing this.
+
+ Defaults to none, which disables this feature.
+
+ `PAPERLESS_ADMIN_USER=<username>`
+
+ : If this environment variable is specified, Paperless automatically
+ creates a superuser with the provided username at start. This is
+ useful in cases where you cannot run the
+ `createsuperuser` command separately, such as Kubernetes
+ or AWS ECS.
+
+ Requires `PAPERLESS_ADMIN_PASSWORD` to be set.
+
+ !!! note
+
+ This will not change an existing \[super\]user's password, nor will
+ it recreate a user that already exists. You can leave this
+ throughout the lifecycle of the containers.
+
+ `PAPERLESS_ADMIN_MAIL=<email>`
+
+ : (Optional) Specify superuser email address. Only used when
+ `PAPERLESS_ADMIN_USER` is set.
+
+ Defaults to `root@localhost`.
+
+ `PAPERLESS_ADMIN_PASSWORD=<password>`
+
+ : Only used when `PAPERLESS_ADMIN_USER` is set. This will
+ be the password of the automatically created superuser.
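+
+ For container platforms where `createsuperuser` cannot be run by hand,
+ the three admin variables might be set together as sketched below. All
+ three values are placeholders chosen for this example.
+
+ ```yaml
+ services:
+   webserver:
+     environment:
+       PAPERLESS_ADMIN_USER: admin                # hypothetical username
+       PAPERLESS_ADMIN_PASSWORD: "change-me"      # required alongside the user
+       PAPERLESS_ADMIN_MAIL: "admin@example.com"  # optional
+ ```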
+
+ `PAPERLESS_COOKIE_PREFIX=<str>`
+
+ : Specify a prefix that is added to the cookies used by paperless to
+ identify the currently logged in user. This is useful for when
+ you're running two instances of paperless on the same host.
+
+ After changing this, you will have to login again.
+
+ Defaults to `""`, which does not alter the cookie names.
+
+ `PAPERLESS_ENABLE_HTTP_REMOTE_USER=<bool>`
+
+ : Allows authentication via HTTP_REMOTE_USER which is used by some SSO
+ applications.
+
+ !!! warning
+
+ This will allow authentication by simply adding a
+ `Remote-User: <username>` header to a request. Use with care! You
+ especially *must* ensure that any such header is not passed from
+ your proxy server to paperless.
+
+ If you're exposing paperless to the internet directly, do not use
+ this.
+
+ Also see the warning [in the official
+ documentation](https://docs.djangoproject.com/en/3.1/howto/auth-remote-user/#configuration).
+
+ Defaults to `false`, which disables this feature.
+
+ `PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME=<str>`
+
+ : If `PAPERLESS_ENABLE_HTTP_REMOTE_USER` is enabled, this
+ property lets you customize the name of the HTTP header from which
+ the authenticated username is extracted. Values are in terms of
+ [HttpRequest.META](https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.META).
+ Thus, the configured value must start with `HTTP_`
+ followed by the normalized actual header name.
+
+ Defaults to `HTTP_REMOTE_USER`.
+
+ `PAPERLESS_LOGOUT_REDIRECT_URL=<str>`
+
+ : URL to redirect the user to after a logout. This can be used
+ together with `PAPERLESS_ENABLE_HTTP_REMOTE_USER` to
+ redirect the user back to the SSO application's logout page.
+
+ Defaults to None, which disables this feature.
+
+ ## OCR settings {#ocr}
+
+ Paperless uses [OCRmyPDF](https://ocrmypdf.readthedocs.io/en/latest/)
+ for performing OCR on documents and images. Paperless uses sensible
+ defaults for most settings, but all of them can be configured to your
+ needs.
+
+ `PAPERLESS_OCR_LANGUAGE=<lang>`
+
+ : Customize the language that paperless will attempt to use when
+ parsing documents.
+
+ It should be a 3-letter language code consistent with ISO 639:
+ <https://www.loc.gov/standards/iso639-2/php/code_list.php>
+
+ Set this to the language most of your documents are written in.
+
+ This can be a combination of multiple languages such as `deu+eng`,
+ in which case tesseract will use whatever language matches best.
+ Keep in mind that tesseract uses much more cpu time with multiple
+ languages enabled.
+
+ Defaults to "eng".
+
+ !!! note
+
+ If your language code contains a '-' such as `chi-sim`, you must use an underscore instead: `chi_sim`.
+
+ `PAPERLESS_OCR_MODE=<mode>`
+
+ : Tell paperless when and how to perform OCR on your documents. Four
+ modes are available:
+
+ - `skip`: Paperless skips all pages that already contain text and
+ performs OCR only on pages where no text is present. This is the
+ safest option.
+
+ - `skip_noarchive`: In addition to skip, paperless won't create
+ an archived version of your documents when it finds any text in
+ them. This is useful if you don't want to have two
+ almost-identical versions of your digital documents in the media
+ folder. This is the fastest option.
+
+ - `redo`: Paperless will OCR all pages of your documents and
+ attempt to replace any existing text layers with new text. This
+ will be useful for documents from scanners that already
+ performed OCR with insufficient results. It will also perform
+ OCR on purely digital documents.
+
+ This option may fail on some documents that have features that
+ cannot be removed, such as forms. In this case, the text from
+ the document is used instead.
+
+ - `force`: Paperless rasterizes your documents, converting any
+ text into images and puts the OCRed text on top. This works for
+ all documents, however, the resulting document may be
+ significantly larger and text won't appear as sharp when zoomed
+ in.
+
+ The default is `skip`, which only performs OCR when necessary and
+ always creates archived documents.
+
+ Read more about this in the [OCRmyPDF
+ documentation](https://ocrmypdf.readthedocs.io/en/latest/advanced.html#when-ocr-is-skipped).
+
+ `PAPERLESS_OCR_CLEAN=<mode>`
+
+ : Tells paperless to use `unpaper` to clean any input document before
+ sending it to tesseract. This uses more resources, but generally
+ results in better OCR results. The following modes are available:
+
+ - `clean`: Apply unpaper.
+ - `clean-final`: Apply unpaper, and use the cleaned images to
+ build the output file instead of the original images.
+ - `none`: Do not apply unpaper.
+
+ Defaults to `clean`.
+
+ !!! note
+
+ `clean-final` is incompatible with OCR mode `redo`. When both
+ `clean-final` and the OCR mode `redo` are configured, `clean` is used
+ instead.
+
+ `PAPERLESS_OCR_DESKEW=<bool>`
+
+ : Tells paperless to correct skewing (slight rotation of input images,
+ mainly due to improper scanning).
+
+ Defaults to `true`, which enables this feature.
+
+ !!! note
+
+ Deskewing is incompatible with ocr mode `redo`. Deskewing will get
+ disabled automatically if `redo` is used as the ocr mode.
+
+ `PAPERLESS_OCR_ROTATE_PAGES=<bool>`
+
+ : Tells paperless to correct page rotation (90°, 180° and 270°
+ rotation).
+
+ If you notice that paperless is not rotating incorrectly rotated
+ pages (or vice versa), try adjusting the threshold up or down (see
+ below).
+
+ Defaults to `true`, which enables this feature.
+
+ `PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD=<num>`
+
+ : Adjust the threshold for automatic page rotation by
+ `PAPERLESS_OCR_ROTATE_PAGES`. This is an arbitrary value reported by
+ tesseract. "15" is a very conservative value, whereas "2" is a
+ very aggressive option and will often result in correctly rotated
+ pages being rotated as well.
+
+ Defaults to "12".
+
+ `PAPERLESS_OCR_OUTPUT_TYPE=<type>`
+
+ : Specify the type of PDF documents that paperless should produce.
+
+ - `pdf`: Modify the PDF document as little as possible.
+ - `pdfa`: Convert PDF documents into PDF/A-2b documents, which is
+ a subset of the entire PDF specification and meant for storing
+ documents long term.
+ - `pdfa-1`, `pdfa-2`, `pdfa-3` to specify the exact version of
+ PDF/A you wish to use.
+
+ If not specified, `pdfa` is used. Remember that paperless also keeps
+ the original input file as well as the archived version.
+
+ `PAPERLESS_OCR_PAGES=<num>`
+
+ : Tells paperless to use only the specified number of pages for OCR.
+ Documents with fewer pages than the specified number get OCR'ed
+ completely.
+
+ Specifying 1 here will only use the first page.
+
+ When combined with `PAPERLESS_OCR_MODE=redo` or
+ `PAPERLESS_OCR_MODE=force`, paperless will not modify any text it
+ finds on excluded pages and copy it verbatim.
+
+ Defaults to 0, which disables this feature and always uses all
+ pages.
+
+ `PAPERLESS_OCR_IMAGE_DPI=<num>`
+
+ : Paperless will OCR any images you put into the system and convert
+ them into PDF documents. This is useful if your scanner produces
+ images. In order to do so, paperless needs to know the DPI of the
+ image. Most images from scanners will have this information embedded
+ and paperless will detect and use that information. In case this
+ fails, it uses this value as a fallback.
+
+ Set this to the DPI your scanner produces images at.
+
+ Default is none, which will automatically calculate image DPI so
+ that the produced PDF documents are A4 sized.
+
+ `PAPERLESS_OCR_MAX_IMAGE_PIXELS=<num>`
+
+ : Paperless will raise a warning when OCRing images which are over
+ this limit and will not OCR images which are more than twice this
+ limit. Note this does not prevent the document from being consumed,
+ but could result in missing text content.
+
+ If unset, will default to the value determined by
+ [Pillow](https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.MAX_IMAGE_PIXELS).
+
+ !!! note
+
+ Increasing this limit could cause Paperless to consume additional
+ resources when consuming a file. Be sure you have sufficient system
+ resources.
+
+ !!! warning
+
+ The limit is intended to prevent malicious files from consuming
+ system resources and causing crashes and other errors. Only increase
+ this value if you are certain your documents are not malicious and
+ you need the text which was not OCRed.
+
+ `PAPERLESS_OCR_USER_ARGS=<json>`
+
+ : OCRmyPDF offers many more options. Use this parameter to specify any
+ additional arguments you wish to pass to OCRmyPDF. Since Paperless
+ uses the API of OCRmyPDF, you have to specify these in a format that
+ can be passed to the API. See [the API reference of
+ OCRmyPDF](https://ocrmypdf.readthedocs.io/en/latest/api.html#reference)
+ for valid parameters. All command line options are supported, but
+ they use underscores instead of dashes.
+
+ !!! warning
+
+ Paperless has been tested to work with the OCR options provided
+ above. There are many options that are incompatible with each other,
+ so specifying invalid options may prevent paperless from consuming
+ any documents.
+
+ Specify arguments as a JSON dictionary. Keep note of lower case
+ booleans and double quoted parameter names and strings. Examples:
+
+ ``` json
+ {"deskew": true, "optimize": 3, "unpaper_args": "--pre-rotate 90"}
+ ```
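+
+ When the value is passed through a Docker Compose file, the JSON has to
+ survive YAML parsing. One way to do that, sketched here with part of
+ the example above, is to wrap the whole dictionary in single quotes so
+ the inner double quotes stay intact.
+
+ ```yaml
+ services:
+   webserver:
+     environment:
+       # Single-quoted YAML string; the content is passed on as JSON text
+       PAPERLESS_OCR_USER_ARGS: '{"deskew": true, "optimize": 3}'
+ ```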
+
+ ## Tika settings {#tika}
+
+ Paperless can make use of [Tika](https://tika.apache.org/) and
+ [Gotenberg](https://gotenberg.dev/) for parsing and converting
- gotenberg:
- image: gotenberg/gotenberg:7.6
- restart: unless-stopped
- command:
- - 'gotenberg'
- - '--chromium-disable-routes=true'
++"Office" documents (such as ".doc", ".xlsx" and ".odt").
++Tika and Gotenberg are also needed to allow parsing of E-Mails (.eml).
++
++If you wish to use this, you must provide a Tika server and a Gotenberg server,
+ configure their endpoints, and enable the feature.
+
+ `PAPERLESS_TIKA_ENABLED=<bool>`
+
+ : Enable (or disable) the Tika parser.
+
+ Defaults to false.
+
+ `PAPERLESS_TIKA_ENDPOINT=<url>`
+
+ : Set the endpoint URL where Paperless can reach your Tika server.
+
+ Defaults to "<http://localhost:9998>".
+
+ `PAPERLESS_TIKA_GOTENBERG_ENDPOINT=<url>`
+
+ : Set the endpoint URL where Paperless can reach your Gotenberg server.
+
+ Defaults to "<http://localhost:3000>".
+
+ If you run paperless on docker, you can add those services to the
+ docker-compose file (see the provided `docker-compose.sqlite-tika.yml`
+ file for reference). The required changes are as follows:
+
+ ```yaml
+ services:
+ # ...
+
+ webserver:
+ # ...
+
+ environment:
+ # ...
+
+ PAPERLESS_TIKA_ENABLED: 1
+ PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000
+ PAPERLESS_TIKA_ENDPOINT: http://tika:9998
+
+ # ...
+
++ gotenberg:
++ image: gotenberg/gotenberg:7.6
++ restart: unless-stopped
++ # The gotenberg chromium route is used to convert .eml files. We do not
++ # want to allow external content like tracking pixels or even javascript.
++ command:
++ - "gotenberg"
++ - "--chromium-disable-javascript=true"
++ - "--chromium-allow-list=file:///tmp/.*"
+
+ tika:
+ image: ghcr.io/paperless-ngx/tika:latest
+ restart: unless-stopped
+ ```
+
+ Add the configuration variables to the environment of the webserver
+ (alternatively put the configuration in the `docker-compose.env` file)
+ and add the additional services below the webserver service. Watch out
+ for indentation.
+
+ Make sure to use the correct format `PAPERLESS_TIKA_ENABLED = 1`
+ so python_dotenv can parse the statement correctly.
+
+ ## Software tweaks {#software_tweaks}
+
+ `PAPERLESS_TASK_WORKERS=<num>`
+
+ : Paperless does multiple things in the background: Maintain the
+ search index, maintain the automatic matching algorithm, check
+ emails, consume documents, etc. This variable specifies how many
+ things it will do in parallel.
+
+ Defaults to 1.
+
+ `PAPERLESS_THREADS_PER_WORKER=<num>`
+
+ : Furthermore, paperless uses multiple threads when consuming
+ documents to speed up OCR. This variable specifies how many pages
+ paperless will process in parallel on a single document.
+
+ !!! warning
+
+ Ensure that the product
+
+ `PAPERLESS_TASK_WORKERS * PAPERLESS_THREADS_PER_WORKER`
+
+ does not exceed your CPU core count or else paperless will be
+ extremely slow. If you want paperless to process many documents in
+ parallel, choose a high worker count. If you want paperless to
+ process very large documents faster, use a higher thread per worker
+ count.
+
+ The default is a balance between the two, according to your CPU core
+ count, with a slight favor towards threads per worker:
+
+ | CPU core count | Workers | Threads |
+ |----------------|---------|---------|
+ | 1              | 1       | 1       |
+ | 2              | 2       | 1       |
+ | 4              | 2       | 2       |
+ | 6              | 2       | 3       |
+ | 8              | 2       | 4       |
+ | 12             | 3       | 4       |
+ | 16             | 4       | 4       |
+
+ If you only specify PAPERLESS_TASK_WORKERS, paperless will adjust
+ PAPERLESS_THREADS_PER_WORKER automatically.
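+
+ As a worked example of the rule above, assume an 8-core machine that
+ mostly consumes many small documents. A sketch that trades threads for
+ workers while keeping the product at the core count (4 * 2 = 8) could
+ be:
+
+ ```yaml
+ services:
+   webserver:
+     environment:
+       # 4 workers * 2 threads = 8, matching the assumed 8 CPU cores
+       PAPERLESS_TASK_WORKERS: 4
+       PAPERLESS_THREADS_PER_WORKER: 2
+ ```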
+
+ `PAPERLESS_WORKER_TIMEOUT=<num>`
+
+ : Machines with few cores or weak ones might not be able to finish OCR
+ on large documents within the default 1800 seconds. So extending
+ this timeout may prove to be useful on weak hardware setups.
+
+ `PAPERLESS_WORKER_RETRY=<num>`
+
+ : If PAPERLESS_WORKER_TIMEOUT has been configured, the retry time for
+ a task can also be configured. By default, this value will be set to
+ 10s more than the worker timeout. This value should never be set
+ lower than the worker timeout.
+
+ `PAPERLESS_TIME_ZONE=<timezone>`
+
+ : Set the time zone here. See
+ <https://docs.djangoproject.com/en/3.1/ref/settings/#std:setting-TIME_ZONE>
+ for details on how to set it.
+
+ Defaults to UTC.
+
+ ## Polling {#polling}
+
+ `PAPERLESS_CONSUMER_POLLING=<num>`
+
+ : If paperless does not find documents added to your consume folder, it
+ might not be able to automatically detect filesystem changes. In
+ that case, specify a polling interval in seconds here, which will
+ then cause paperless to periodically check your consumption
+ directory for changes. This will also disable listening for file
+ system changes with `inotify`.
+
+ Defaults to 0, which disables polling and uses filesystem
+ notifications.
+
+ `PAPERLESS_CONSUMER_POLLING_RETRY_COUNT=<num>`
+
+ : If consumer polling is enabled, sets the number of times paperless
+ will check for a file to remain unmodified.
+
+ Defaults to 5.
+
+ `PAPERLESS_CONSUMER_POLLING_DELAY=<num>`
+
+ : If consumer polling is enabled, sets the delay in seconds between
+ each check (above) paperless will do while waiting for a file to
+ remain unmodified.
+
+ Defaults to 5.
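+
+ For a consumption directory where filesystem notifications are not
+ delivered, the three polling options might be combined as in this
+ sketch. The 30-second interval is an arbitrary value picked for
+ illustration; the other two are the documented defaults.
+
+ ```yaml
+ services:
+   webserver:
+     environment:
+       # Check the consumption directory every 30 seconds (disables inotify)
+       PAPERLESS_CONSUMER_POLLING: 30
+       # Number of checks for a file to remain unmodified
+       PAPERLESS_CONSUMER_POLLING_RETRY_COUNT: 5
+       # Seconds between those checks
+       PAPERLESS_CONSUMER_POLLING_DELAY: 5
+ ```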
+
+ ## iNotify {#inotify}
+
+ `PAPERLESS_CONSUMER_INOTIFY_DELAY=<num>`
+
+ : Sets the time in seconds the consumer will wait for additional
+ events from inotify before the consumer will consider a file ready
+ and begin consumption. Certain scanners or network setups may
+ generate multiple events for a single file, leading to multiple
+ consumers working on the same file. Configure this to prevent that.
+
+ Defaults to 0.5 seconds.
+
+ `PAPERLESS_CONSUMER_DELETE_DUPLICATES=<bool>`
+
+ : When the consumer detects a duplicate document, it will not touch
+ the original document. This default behavior can be changed here.
+
+ Defaults to false.
+
+ `PAPERLESS_CONSUMER_RECURSIVE=<bool>`
+
+ : Enable recursive watching of the consumption directory. Paperless
+ will then pick up files in subdirectories within your
+ consumption directory as well.
+
+ Defaults to false.
+
+ `PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS=<bool>`
+
+ : Set the names of subdirectories as tags for consumed files. E.g.
+ `<CONSUMPTION_DIR>/foo/bar/file.pdf` will add the tags "foo" and
+ "bar" to the consumed file. Paperless will create any tags that
+ don't exist yet.
+
+ This is useful for sorting documents with certain tags such as `car`
+ or `todo` prior to consumption. These folders won't be deleted.
+
+ PAPERLESS_CONSUMER_RECURSIVE must be enabled for this to work.
+
+ Defaults to false.
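+
+ Since tagging by subdirectory depends on recursive watching, both
+ options need to be enabled together. A sketch for a Docker Compose
+ setup, with the boolean values quoted as strings:
+
+ ```yaml
+ services:
+   webserver:
+     environment:
+       # Required for subdirectories to be watched at all
+       PAPERLESS_CONSUMER_RECURSIVE: "true"
+       # <CONSUMPTION_DIR>/foo/bar/file.pdf gets the tags "foo" and "bar"
+       PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS: "true"
+ ```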
+
+ `PAPERLESS_CONSUMER_ENABLE_BARCODES=<bool>`
+
+ : Enables the scanning and page separation based on detected barcodes.
+ This allows for scanning and adding multiple documents per uploaded
+ file, which are separated by one or multiple barcode pages.
+
+ For ease of use, it is suggested to use a standardized separation
+ page, e.g. [here](https://www.alliancegroup.co.uk/patch-codes.htm).
+
+ If no barcodes are detected in the uploaded file, no page separation
+ will happen.
+
+ The original document will be removed and the separated pages will
+ be saved as pdf.
+
+ Defaults to false.
+
+ `PAPERLESS_CONSUMER_BARCODE_TIFF_SUPPORT=<bool>`
+
+ : Whether TIFF image files should be scanned for barcodes. This will
+ automatically convert any TIFF image(s) to pdfs for later
+ processing. This only has an effect if
+ PAPERLESS_CONSUMER_ENABLE_BARCODES has been enabled.
+
+ Defaults to false.
+
+ `PAPERLESS_CONSUMER_BARCODE_STRING=PATCHT`
+
+ : Defines the string to be detected as a separator barcode. If
+ paperless is used with the PATCH-T separator pages, users shouldn't
+ change this.
+
+ Defaults to "PATCHT".
+
+ `PAPERLESS_CONVERT_MEMORY_LIMIT=<num>`
+
+ : On smaller systems, or even in the case of Very Large Documents, the
+ consumer may explode, complaining about how it's "unable to extend
+ pixel cache". In such cases, try setting this to a reasonably low
+ value, like 32. The default is to use whatever is necessary to do
+ everything without writing to disk, and units are in megabytes.
+
+ For more information on how to use this value, you should search the
+ web for "MAGICK_MEMORY_LIMIT".
+
+ Defaults to 0, which disables the limit.
+
+ `PAPERLESS_CONVERT_TMPDIR=<path>`
+
+ : Similar to the memory limit, if you've got a small system and your
+ OS mounts /tmp as tmpfs, you should set this to a path that's on a
+ physical disk, like /home/your_user/tmp or something. ImageMagick
+ will use this as scratch space when crunching through very large
+ documents.
+
+ For more information on how to use this value, you should search the
+ web for "MAGICK_TMPDIR".
+
+ Default is none, which disables the temporary directory.
+
+ `PAPERLESS_POST_CONSUME_SCRIPT=<filename>`
+
+ : After a document is consumed, Paperless can trigger an arbitrary
+ script if you like. This script will be passed a number of arguments
+ for you to work with. For more information, take a look at [Post-consumption script](advanced_usage#post_consume_script).
+
+ The default is blank, which means nothing will be executed.
+
+ `PAPERLESS_FILENAME_DATE_ORDER=<format>`
+
+ : Paperless will check the document text for document date
+ information. Use this setting to enable checking the document
+ filename for date information. The date order can be set to any
+ option as specified in
+ <https://dateparser.readthedocs.io/en/latest/settings.html#date-order>.
+ The filename will be checked first, and if nothing is found, the
+ document text will be checked as normal.
+
+ A date in a filename must have some separators (`.`,
+ `-`, `/`, etc) for it to be parsed.
+
+ Defaults to none, which disables this feature.
+
+ `PAPERLESS_NUMBER_OF_SUGGESTED_DATES=<num>`
+
+ : Paperless searches an entire document for dates. The first date
+ found will be used as the initial value for the created date. When
+ this variable is greater than 0 (or left at its default value),
+ paperless will also suggest other dates found in the document, up to
+ a maximum of this setting. Note that duplicates will be removed,
+ which can result in fewer dates displayed in the frontend than this
+ setting value.
+
+ The task to find all dates can be time-consuming and increases with
+ a higher (maximum) number of suggested dates and slower hardware.
+
+ Defaults to 3. Set to 0 to disable this feature.
+
+ `PAPERLESS_THUMBNAIL_FONT_NAME=<filename>`
+
+ : Paperless creates thumbnails for plain text files by rendering the
+ content of the file on an image and uses a predefined font for that.
+ This font can be changed here.
+
+ Note that this won't have any effect on already generated
+ thumbnails.
+
+ Defaults to
+ `/usr/share/fonts/liberation/LiberationSerif-Regular.ttf`.
+
+ `PAPERLESS_IGNORE_DATES=<string>`
+
+ : Paperless parses a document's creation date from filename and file
+ content. You may specify a comma-separated list of dates that should
+ be ignored during this process. This is useful for special dates
+ (like date of birth) that appear in documents regularly but are very
+ unlikely to be the document's creation date.
+
+ The date is parsed using the order specified in `PAPERLESS_DATE_ORDER`.
+
+ Defaults to an empty string to not ignore any dates.
+
+ `PAPERLESS_DATE_ORDER=<format>`
+
+ : Paperless will try to determine the document creation date from its
+ contents. Specify the date format Paperless should expect to see
+ within your documents.
+
+ This option defaults to DMY which translates to day first, month
+ second, and year last order. Characters D, M, or Y can be shuffled
+ to meet the required order.
+
+ `PAPERLESS_CONSUMER_IGNORE_PATTERNS=<json>`
+
+ : By default, paperless ignores certain files and folders in the
+ consumption directory, such as system files created by macOS.
+
+ This can be adjusted by configuring a custom json array with
+ patterns to exclude.
+
+ Defaults to
+ `[".DS_STORE/*", "._*", ".stfolder/*", ".stversions/*", ".localized/*", "desktop.ini"]`.
+
+ ## Binaries
+
+ There are a few external software packages that Paperless expects to
+ find on your system when it starts up. Unless you've done something
+ creative with their installation, you probably won't need to edit any
+ of these. However, if you've installed these programs somewhere where
+ simply typing the name of the program doesn't automatically execute it
+ (i.e. the program isn't in your `$PATH`), then you'll need to specify
+ the literal path for that program.
+
+ `PAPERLESS_CONVERT_BINARY=<path>`
+
+ : Defaults to "convert".
+
+ `PAPERLESS_GS_BINARY=<path>`
+
+ : Defaults to "gs".
+
+ ## Docker-specific options {#docker}
+
+ These options don't have any effect in `paperless.conf`. These options
+ adjust the behavior of the docker container. Configure these in
+ `docker-compose.env`.
+
+ `PAPERLESS_WEBSERVER_WORKERS=<num>`
+
+ : The number of worker processes the webserver should spawn. More
+ worker processes usually let the front end load data much more
+ quickly. However, each worker process also loads the entire
+ application into memory separately, so increasing this value will
+ increase RAM usage.
+
+ Defaults to 1.
+
+ `PAPERLESS_BIND_ADDR=<ip address>`
+
+ : The IP address the webserver will listen on inside the container.
+ There are special setups where you may need to configure this value
+ to restrict the IP address or interface the webserver listens on.
+
+ Defaults to \[::\], meaning all interfaces, including IPv6.
+
+ `PAPERLESS_PORT=<port>`
+
+ : The port number the webserver will listen on inside the container.
+ There are special setups where you may need this to avoid collisions
+ with other services (like using podman with multiple containers in
+ one pod).
+
+ Don't change this when using Docker. To change the port on which the
+ webserver is reachable outside of the container, refer instead to
+ the "ports" key in `docker-compose.yml`.
+
+ Defaults to 8000.
+
+ `USERMAP_UID=<uid>`
+
+ : The ID of the paperless user in the container. Set this to your
+ actual user ID on the host system, which you can get by executing
+
+ ``` shell-session
+ $ id -u
+ ```
+
+ Paperless will change ownership on its folders to this user, so you
+ need to get this right in order to be able to write to the
+ consumption directory.
+
+ Defaults to 1000.
+
+ `USERMAP_GID=<gid>`
+
+ : The ID of the paperless group in the container. Set this to your
+ actual group ID on the host system, which you can get by executing
+
+ ``` shell-session
+ $ id -g
+ ```
+
+ Paperless will change ownership on its folders to this group, so you
+ need to get this right in order to be able to write to the
+ consumption directory.
+
+ Defaults to 1000.
+
+ `PAPERLESS_OCR_LANGUAGES=<list>`
+
+ : Additional OCR languages to install. By default, paperless comes
+ with English, German, Italian, Spanish and French. If your language
+ is not in this list, install additional languages with this
+ configuration option:
+
+ ``` bash
+ PAPERLESS_OCR_LANGUAGES=tur ces
+ ```
+
+ To actually use these languages, also set the default OCR language
+ of paperless:
+
+ ``` bash
+ PAPERLESS_OCR_LANGUAGE=tur
+ ```
+
+ Defaults to none, which does not install any additional languages.
+
+ `PAPERLESS_ENABLE_FLOWER=<defined>`
+
+ : If this environment variable is defined, the Celery monitoring tool
+ [Flower](https://flower.readthedocs.io/en/latest/index.html) will be
+ started by the container.
+
+ You can read more about this in the [advanced documentation](advanced#celery-monitoring).
+
+ ## Update Checking {#update-checking}
+
+ `PAPERLESS_ENABLE_UPDATE_CHECK=<bool>`
+
+ !!! note
+
+ This setting was deprecated in favor of a frontend setting after
+ v1.9.2. A one-time migration is performed for users who have this
+ setting set. This setting is always ignored if the corresponding
+ frontend setting has been set.
--- /dev/null
-gotenberg:
- image: gotenberg/gotenberg:7.6
- restart: unless-stopped
+ # Troubleshooting
+
+ ## No files are added by the consumer
+
+ Check for the following issues:
+
+ - Ensure that the directory you're putting your documents in is the
+ folder paperless is watching. With docker, this is configured
+ in the `docker-compose.yml` file. Without docker, look at the
+ `CONSUMPTION_DIR` setting. Don't adjust this setting if you're
+ using docker.
+
+ - Ensure that Redis is up and running. Paperless processes tasks
+ asynchronously, and documents only reach the task processor if
+ Redis is running.
+
+ - Ensure that the task processor is running. Docker does this
+ automatically. Without docker, start the task processor manually by executing
+
+ ```shell-session
+ $ celery --app paperless worker
+ ```
+
+ - Look at the output of paperless and inspect it for any errors.
+
+ - Go to the admin interface, and check if there are failed tasks. If
+ so, the tasks will contain an error message.
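
The Redis check from the list above can be scripted. The sketch below opens a TCP connection and sends a `PING`; the host and port are assumptions taken from the documented default `redis://localhost:6379`, and `redis_is_up` is a hypothetical helper, not part of paperless.

```python
import socket


def redis_is_up(host: str = "localhost", port: int = 6379, timeout: float = 2.0) -> bool:
    """Return True if something at host:port answers like a Redis server."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
    except OSError:
        # Connection refused, timeout, unknown host and similar failures.
        return False
    # "+PONG" is the normal reply; a "-NOAUTH" error still shows that the
    # server is running, only that it requires credentials.
    return reply.startswith(b"+PONG") or reply.startswith(b"-NOAUTH")


print("redis reachable:", redis_is_up())
```

If this reports `False` for the host and port you configured, fix the Redis connection before looking at the consumer itself.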
+
+ ## Consumer warns `OCR for XX failed`
+
+ If you find the OCR accuracy to be too low, and/or the document consumer
+ warns that
+ `OCR for XX failed, but we're going to stick with what we've got since FORGIVING_OCR is enabled`,
+ then you might need to install the [Tesseract language
+ files](http://packages.ubuntu.com/search?keywords=tesseract-ocr)
+ matching your document's languages.
+
+ As an example, if you are running Paperless-ngx from any Ubuntu or
+ Debian box, and your documents are written in Spanish you may need to
+ run:
+
+ apt-get install -y tesseract-ocr-spa
+
+ ## Consumer fails to pick up any new files
+
+ If you notice that the consumer only picks up files that are in the
+ consumption directory at startup, but doesn't find any files added
+ later, you will need to enable filesystem polling with the configuration
+ option `PAPERLESS_CONSUMER_POLLING`; see the
+ [polling configuration](/configuration#polling).
+
+ This will disable listening to filesystem changes with inotify and
+ paperless will manually check the consumption directory for changes
+ instead.
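
Conceptually, polling means repeatedly listing the directory and comparing the result with the previous listing, which does not depend on change notifications from the filesystem. The following is a minimal sketch of that idea, not the consumer's actual code; `snapshot` and `poll_once` are hypothetical names.

```python
import os
import tempfile


def snapshot(directory: str) -> set:
    """Return the set of regular file names currently in the directory."""
    return {entry.name for entry in os.scandir(directory) if entry.is_file()}


def poll_once(directory: str, seen: set) -> list:
    """Compare the directory with the previous snapshot and report new files."""
    current = snapshot(directory)
    new_files = sorted(current - seen)
    seen.clear()
    seen.update(current)
    return new_files


with tempfile.TemporaryDirectory() as consume_dir:
    seen = snapshot(consume_dir)
    open(os.path.join(consume_dir, "scan.pdf"), "wb").close()
    print(poll_once(consume_dir, seen))  # ['scan.pdf']
    print(poll_once(consume_dir, seen))  # []
```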
+
+ ## Paperless always redirects to /admin
+
+ You probably had the old paperless installed at some point. Paperless
+ installed a permanent redirect to /admin in your browser, and you need
+ to clear your browsing data / cache to fix that.
+
+ ## Operation not permitted
+
+ You might see errors such as:
+
+ ```shell-session
+ chown: changing ownership of '../export': Operation not permitted
+ ```
+
+ The container tries to set file ownership on the listed directories.
+ This is required so that the user running paperless inside docker has
+ write permissions to these folders. The error appears when ownership
+ cannot be changed, for example when these directories point to NFS shares.
+
+ Ensure that `chown` is possible on these directories.
+
+ ## Classifier error: No training data available
+
+ This indicates that the Auto matching algorithm found no documents to
+ learn from. There are two possible reasons:
+
+ - You don't use the Auto matching algorithm: The error can be safely
+ ignored in this case.
+ - You are using the Auto matching algorithm: The classifier explicitly
+ excludes documents with Inbox tags. Verify that there are documents
+ in your archive without inbox tags. The algorithm will only learn
+ from documents not in your inbox.
+
+ ## UserWarning in sklearn on every single document
+
+ You may encounter warnings like this:
+
+ ```
+ /usr/local/lib/python3.7/site-packages/sklearn/base.py:315:
+ UserWarning: Trying to unpickle estimator CountVectorizer from version 0.23.2 when using version 0.24.0.
+ This might lead to breaking code or invalid results. Use at your own risk.
+ ```
+
+ This happens when certain dependencies of paperless that are responsible
+ for the auto matching algorithm are updated. After updating these, your
+ current training data _might_ not be compatible anymore. This can be
+ ignored in most cases. This warning will disappear automatically when
+ paperless updates the training data.
+
+ If you want to get rid of the warning or actually experience issues with
+ automatic matching, delete the file `classification_model.pickle` in the
+ data directory and let paperless recreate it.
+
+ ## 504 Server Error: Gateway Timeout when adding Office documents
+
+ You may experience these errors when using the optional Tika
+ integration:
+
+ ```
+ requests.exceptions.HTTPError: 504 Server Error: Gateway Timeout for url: http://gotenberg:3000/forms/libreoffice/convert
+ ```
+
+ Gotenberg is a server that converts Office documents into PDF documents
+ and has a default timeout of 30 seconds. When conversion takes longer,
+ Gotenberg raises this error.
+
+ You can increase the timeout by configuring a command flag for Gotenberg
+ (see also [here](https://gotenberg.dev/docs/modules/api#properties)). If
+ using docker-compose, this is achieved by the following configuration
+ change in the `docker-compose.yml` file:
+
+ ```yaml
- - 'gotenberg'
- - '--chromium-disable-routes=true'
- - '--api-timeout=60'
++ # The gotenberg chromium route is used to convert .eml files. We do not
++ # want to allow external content like tracking pixels or even javascript.
+ command:
++ - "gotenberg"
++ - "--chromium-disable-javascript=true"
++ - "--chromium-allow-list=file:///tmp/.*"
++ - "--api-timeout=60"
+ ```
+
+ ## Permission denied errors in the consumption directory
+
+ You might encounter errors such as:
+
+ ```shell-session
+ The following error occured while consuming document.pdf: [Errno 13] Permission denied: '/usr/src/paperless/src/../consume/document.pdf'
+ ```
+
+ This happens when paperless does not have permission to delete files
+ inside the consumption directory. Ensure that `USERMAP_UID` and
+ `USERMAP_GID` are set to the user id and group id you use on the host
+ operating system, if these are different from `1000`. See [Docker setup](setup#docker_hub).
+
+ Also ensure that you are able to read and write to the consumption
+ directory on the host.
+
+ ## OSError: \[Errno 19\] No such device when consuming files
+
+ If you experience errors such as:
+
+ ```shell-session
+ File "/usr/local/lib/python3.7/site-packages/whoosh/codec/base.py", line 570, in open_compound_file
+ return CompoundStorage(dbfile, use_mmap=storage.supports_mmap)
+ File "/usr/local/lib/python3.7/site-packages/whoosh/filedb/compound.py", line 75, in __init__
+ self._source = mmap.mmap(fileno, 0, access=mmap.ACCESS_READ)
+ OSError: [Errno 19] No such device
+
+ During handling of the above exception, another exception occurred:
+
+ Traceback (most recent call last):
+ File "/usr/local/lib/python3.7/site-packages/django_q/cluster.py", line 436, in worker
+ res = f(*task["args"], **task["kwargs"])
+ File "/usr/src/paperless/src/documents/tasks.py", line 73, in consume_file
+ override_tag_ids=override_tag_ids)
+ File "/usr/src/paperless/src/documents/consumer.py", line 271, in try_consume_file
+ raise ConsumerError(e)
+ ```
+
+ Paperless uses a search index to provide better and faster full text
+ searching. This search index is stored inside the `data` folder. The
+ search index uses memory-mapped files (mmap). The above error indicates
+ that paperless was unable to create and open these files.
+
+ This happens when you're trying to store the data directory on certain
+ file systems (mostly network shares) that don't support memory-mapped
+ files.
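
To find out whether a given directory supports memory-mapped files before moving the data directory there, you can try to map a small probe file in it. This is a sketch using the standard library; `supports_mmap` is a hypothetical helper that performs the same kind of read-only `mmap` call that fails in the traceback above.

```python
import mmap
import os
import tempfile


def supports_mmap(directory: str) -> bool:
    """Try to memory-map a small temporary file created inside the directory."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, b"paperless mmap probe")
        try:
            with mmap.mmap(fd, 0, access=mmap.ACCESS_READ):
                return True
        except OSError:
            # For example "[Errno 19] No such device" on unsupported file systems.
            return False
    finally:
        os.close(fd)
        os.remove(path)


print(supports_mmap(tempfile.gettempdir()))
```

Call it with the path you intend to use as the data directory; a result of `False` suggests the search index cannot be stored there.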
+
+ ## Web-UI stuck at "Loading\..."
+
+ There are several possible causes.
+
+ 1. If you built the docker image yourself or deployed using the bare
+ metal route, make sure that there are files in
+ `<paperless-root>/static/frontend/<lang-code>/`. If there are no
+ files, make sure that you executed `collectstatic` successfully,
+ either manually or as part of the docker image build.
+
+ If the front end is still missing, make sure that the front end is
+ compiled (files present in `src/documents/static/frontend`). If it
+ is not, you need to compile the front end yourself or download the
+ release archive instead of cloning the repository.
+
+ 2. Check the output of the web server. You might see errors like this:
+
+ ```
+ [2021-01-25 10:08:04 +0000] [40] [ERROR] Socket error processing request.
+ Traceback (most recent call last):
+ File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 134, in handle
+ self.handle_request(listener, req, client, addr)
+ File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 190, in handle_request
+ util.reraise(*sys.exc_info())
+ File "/usr/local/lib/python3.7/site-packages/gunicorn/util.py", line 625, in reraise
+ raise value
+ File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 178, in handle_request
+ resp.write_file(respiter)
+ File "/usr/local/lib/python3.7/site-packages/gunicorn/http/wsgi.py", line 396, in write_file
+ if not self.sendfile(respiter):
+ File "/usr/local/lib/python3.7/site-packages/gunicorn/http/wsgi.py", line 386, in sendfile
+ sent += os.sendfile(sockno, fileno, offset + sent, count)
+ OSError: [Errno 22] Invalid argument
+ ```
+
+ To fix this issue, add
+
+ ```
+ SENDFILE=0
+ ```
+
+ to your `docker-compose.env` file.
+
+ ## Error while reading metadata
+
+ You might find messages like these in your log files:
+
+ ```
+ [WARNING] [paperless.parsing.tesseract] Error while reading metadata
+ ```
+
+ This indicates that paperless failed to read PDF metadata from one of
+ your documents. This happens when you open the affected documents in
+ paperless for editing. Paperless will continue to work, and will simply
+ not show the invalid metadata.
+
+ ## Consumer fails with a FileNotFoundError
+
+ You might find messages like these in your log files:
+
+ ```
+ [ERROR] [paperless.consumer] Error while consuming document SCN_0001.pdf: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/ocrmypdf.io.yhk3zbv0/origin.pdf'
+ Traceback (most recent call last):
+ File "/app/paperless/src/paperless_tesseract/parsers.py", line 261, in parse
+ ocrmypdf.ocr(**args)
+ File "/usr/local/lib/python3.8/dist-packages/ocrmypdf/api.py", line 337, in ocr
+ return run_pipeline(options=options, plugin_manager=plugin_manager, api=True)
+ File "/usr/local/lib/python3.8/dist-packages/ocrmypdf/_sync.py", line 385, in run_pipeline
+ exec_concurrent(context, executor)
+ File "/usr/local/lib/python3.8/dist-packages/ocrmypdf/_sync.py", line 302, in exec_concurrent
+ pdf = post_process(pdf, context, executor)
+ File "/usr/local/lib/python3.8/dist-packages/ocrmypdf/_sync.py", line 235, in post_process
+ pdf_out = metadata_fixup(pdf_out, context)
+ File "/usr/local/lib/python3.8/dist-packages/ocrmypdf/_pipeline.py", line 798, in metadata_fixup
+ with pikepdf.open(context.origin) as original, pikepdf.open(working_file) as pdf:
+ File "/usr/local/lib/python3.8/dist-packages/pikepdf/_methods.py", line 923, in open
+ pdf = Pdf._open(
+ FileNotFoundError: [Errno 2] No such file or directory: '/tmp/ocrmypdf.io.yhk3zbv0/origin.pdf'
+ ```
+
+ This probably indicates paperless tried to consume the same file twice.
+ This can happen for a number of reasons, depending on how documents are
+ placed into the consume folder. If paperless is using inotify (the
+ default) to check for documents, try adjusting the
+ [inotify configuration](/configuration#inotify). If polling is enabled, try adjusting the
+ [polling configuration](/configuration#polling).
+
+ ## Consumer fails waiting for file to remain unmodified
+
+ You might find messages like these in your log files:
+
+ ```
+ [ERROR] [paperless.management.consumer] Timeout while waiting on file /usr/src/paperless/src/../consume/SCN_0001.pdf to remain unmodified.
+ ```
+
+ This indicates paperless timed out while waiting for the file to be
+ completely written to the consume folder. Adjusting
+ [polling configuration](/configuration#polling) values should resolve the issue.
+
+ !!! note
+
+ The user will need to manually move the file out of the consume folder
+ and back in, for the initial failing file to be consumed.
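
The idea of waiting for a file to remain unmodified can be pictured as follows. This is only a sketch of the general technique, not the consumer's implementation; `wait_until_stable`, `checks` and `delay` are hypothetical names that play a role similar to the retry and delay values in the polling configuration.

```python
import os
import time


def wait_until_stable(path: str, checks: int = 3, delay: float = 0.5) -> bool:
    """Return True once size and mtime are identical on two consecutive checks."""
    previous = None
    for _ in range(checks):
        stat = os.stat(path)
        current = (stat.st_size, stat.st_mtime)
        if current == previous:
            return True
        previous = current
        time.sleep(delay)
    # The file kept changing for the whole observation window, which is
    # comparable to the timeout reported in the log message above.
    return False
```

A slow scanner or network copy that is still writing after `checks * delay` seconds would make this return `False`, which is why larger values help.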
+
+ ## Consumer fails reporting "OS reports file as busy still"
+
+ You might find messages like these in your log files:
+
+ ```
+ [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/src/../consume/SCN_0001.pdf: OS reports file as busy still
+ ```
+
+ This indicates paperless was unable to open the file, as the OS reported
+ the file as still being in use. To prevent a crash, paperless did not
+ try to consume the file. If paperless is using inotify (the default) to
+ check for documents, try adjusting the
+ [inotify configuration](/configuration#inotify). If polling is enabled, try adjusting the
+ [polling configuration](/configuration#polling).
+
+ !!! note
+
+ The user will need to manually move the file out of the consume folder
+ and back in, for the initial failing file to be consumed.
+
+ ## Log reports "Creating PaperlessTask failed"
+
+ You might find messages like these in your log files:
+
+ ```
+ [ERROR] [paperless.management.consumer] Creating PaperlessTask failed: db locked
+ ```
+
+ You are likely using an sqlite-based installation with an increased
+ number of workers, and are running into sqlite's concurrency
+ limitations. Uploading or consuming multiple files at once results in
+ many workers attempting to access the database simultaneously.
+
+ Consider switching to a PostgreSQL database if you often process
+ many documents at once. Otherwise, try increasing the
+ `PAPERLESS_DB_TIMEOUT` setting to allow more time for the database to
+ unlock. This may have minor performance implications.
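
The locking behavior can be reproduced with Python's built-in `sqlite3` module. The sketch below assumes that the timeout setting plays the same role as the `timeout` argument of `sqlite3.connect`; `try_write_lock` is a hypothetical helper and the database file is a throwaway.

```python
import os
import sqlite3
import tempfile


def try_write_lock(db_path: str, timeout: float) -> bool:
    """Try to take sqlite's write lock, waiting up to `timeout` seconds."""
    conn = sqlite3.connect(db_path, timeout=timeout, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # requests the write lock right away
        conn.execute("ROLLBACK")
        return True
    except sqlite3.OperationalError:  # raised as "database is locked"
        return False
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "db.sqlite3")
    writer = sqlite3.connect(db_path, isolation_level=None)
    writer.execute("BEGIN IMMEDIATE")            # first worker holds the lock
    print(try_write_lock(db_path, timeout=0.2))  # False
    writer.execute("COMMIT")                     # lock released
    print(try_write_lock(db_path, timeout=0.2))  # True
    writer.close()
```

A longer timeout only helps when the other writer finishes within that time, which is why a database server is the better fit for sustained parallel load.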
+
+ ## gunicorn fails to start with "is not a valid port number"
+
+ You are likely running on Kubernetes, which automatically creates an
+ environment variable named `${serviceName}_PORT`. For a service named
+ `paperless`, this is the same environment variable that Paperless uses
+ to optionally change the port gunicorn listens on.
+
+ To fix this, explicitly set `PAPERLESS_PORT` to your desired
+ port, or to the default of 8000.
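
The sketch below shows why the injected value is rejected. Kubernetes fills the variable with a URL-like string rather than a bare number; the IP address in the example is made up, and `parse_port` is a hypothetical helper, not gunicorn's own validation code.

```python
def parse_port(value: str) -> int:
    """Return the value as a TCP port number or raise ValueError."""
    if not value.isdigit() or not 0 < int(value) < 65536:
        raise ValueError(f"{value!r} is not a valid port number")
    return int(value)


print(parse_port("8000"))  # 8000
try:
    # Shape of the value Kubernetes injects for a service (address is made up).
    parse_port("tcp://10.96.0.12:8000")
except ValueError as err:
    print(err)  # 'tcp://10.96.0.12:8000' is not a valid port number
```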