git.ipfire.org Git - thirdparty/paperless-ngx.git/commitdiff
Add info re tesseract language codes
author shamoon <4887959+shamoon@users.noreply.github.com>
Mon, 10 Apr 2023 21:04:30 +0000 (14:04 -0700)
committer shamoon <4887959+shamoon@users.noreply.github.com>
Mon, 10 Apr 2023 21:04:30 +0000 (14:04 -0700)
Closes #3065

docs/configuration.md

index 046904eaf4923b8032d46bd0c84735d3a4309ff7..aca9961e28e3b7cf59e45880edef8747350921c2 100644
@@ -1088,10 +1088,13 @@ actual group ID on the host system, which you can get by executing
 : Additional OCR languages to install. By default, paperless comes
 with English, German, Italian, Spanish and French. If your language
 is not in this list, install additional languages with this
-configuration option ([find the right LangCodes](https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html)):
+configuration option. You will need to [find the right LangCodes](https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html),
+but note that [tesseract-ocr-\* package names](https://packages.debian.org/bullseye/graphics/)
+do not always correspond with the language codes, e.g. "chi_tra" should be
+specified as "chi-tra".
 
     ``` bash
-    PAPERLESS_OCR_LANGUAGES=tur ces
+    PAPERLESS_OCR_LANGUAGES=tur ces chi-tra
     ```
 
     Make sure it's a space separated list when using several values.
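
The note added in this hunk can be illustrated with a small sketch. It assumes the only difference between a tesseract language code and the matching `tesseract-ocr-*` package suffix is `_` versus `-`, which holds for the `chi_tra` example but is not guaranteed for every language. The helper name `to_package_code` is hypothetical and not part of paperless-ngx.

```shell
#!/bin/sh
# Hypothetical helper (not part of paperless-ngx): rewrite a tesseract
# language code into the spelling used by tesseract-ocr-* package names.
# Assumption: the two spellings differ only in "_" versus "-".
to_package_code() {
    printf '%s\n' "$1" | tr '_' '-'
}

# Build the space separated value expected by PAPERLESS_OCR_LANGUAGES.
langs=""
for code in tur ces chi_tra; do
    langs="${langs:+$langs }$(to_package_code "$code")"
done

echo "PAPERLESS_OCR_LANGUAGES=$langs"
```

It may still be worth checking the resulting names against the package list linked in the hunk before relying on them.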