Difference between revisions of "Pdftotext"
(Created page with " {{PDF}}") |
|||
(One intermediate revision by the same user not shown) | |||
Line 1: | Line 1: | ||
+ | {{lc}} | ||
+ | pdftotext file.pdf | ||
+ | pdftotext -htmlmeta file.pdf | ||
− | {{PDF}} | + | |
+ | <pre> | ||
+ | pdftotext --help | ||
+ | pdftotext version 22.05.0 | ||
+ | Copyright 2005-2022 The Poppler Developers - http://poppler.freedesktop.org | ||
+ | Copyright 1996-2011, 2022 Glyph & Cog, LLC | ||
+ | Usage: pdftotext [options] <PDF-file> [<text-file>] | ||
+ | -f <int> : first page to convert | ||
+ | -l <int> : last page to convert | ||
+ | -r <fp> : resolution, in DPI (default is 72) | ||
+ | -x <int> : x-coordinate of the crop area top left corner | ||
+ | -y <int> : y-coordinate of the crop area top left corner | ||
+ | -W <int> : width of crop area in pixels (default is 0) | ||
+ | -H <int> : height of crop area in pixels (default is 0) | ||
+ | -layout : maintain original physical layout | ||
+ | -fixed <fp> : assume fixed-pitch (or tabular) text | ||
+ | -raw : keep strings in content stream order | ||
+ | -nodiag : discard diagonal text | ||
+ | -htmlmeta : generate a simple HTML file, including the meta information | ||
+ | -tsv : generate a simple TSV file, including the meta information for bounding boxes | ||
+ | -enc <string> : output text encoding name | ||
+ | -listenc : list available encodings | ||
+ | -eol <string> : output end-of-line convention (unix, dos, or mac) | ||
+ | -nopgbrk : don't insert page breaks between pages | ||
+ | -bbox : output bounding box for each word and page size to html. Sets -htmlmeta | ||
+ | -bbox-layout : like -bbox but with extra layout bounding box data. Sets -htmlmeta | ||
+ | -cropbox : use the crop box rather than media box | ||
+ | -colspacing <fp> : how much spacing we allow after a word before considering adjacent text to be a new column, as a fraction of the font size (default is 0.7, old releases had a 0.3 default) | ||
+ | -opw <string> : owner password (for encrypted files) | ||
+ | -upw <string> : user password (for encrypted files) | ||
+ | -q : don't print any messages or errors | ||
+ | -v : print copyright and version info | ||
+ | -h : print usage information | ||
+ | -help : print usage information | ||
+ | --help : print usage information | ||
+ | -? : print usage information | ||
+ | </pre> | ||
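The page-range and layout options listed above can be combined in a small wrapper. This is only a sketch: the function name `extract_pages` is hypothetical, and it assumes that passing `-` as the text file sends the output to stdout, as described in the pdftotext man page.

```shell
# Hypothetical helper: extract a page range while keeping the physical layout.
# usage: extract_pages <first-page> <last-page> <PDF-file> [<text-file>]
extract_pages() {
  # -f / -l  : first and last page to convert
  # -layout  : maintain original physical layout
  # -nopgbrk : don't insert page breaks between pages
  # The output defaults to "-" (stdout) when no text file is given.
  pdftotext -f "$1" -l "$2" -layout -nopgbrk "$3" "${4:--}"
}

# Example calls (file names are placeholders):
#   extract_pages 2 5 file.pdf            # pages 2-5 to stdout
#   extract_pages 1 1 file.pdf first.txt  # page 1 to first.txt
```

The function only forwards its arguments, so any error message or exit status comes from `pdftotext` itself.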
+ | |||
+ | == See also == | ||
+ | * {{PDF}} | ||
+ | |||
+ | [[Category:PDF]] |
Latest revision as of 12:07, 24 May 2022