[LibreOffice] Converting Office documents to PDF, HTML, JPG, and other formats with LibreOffice

LibreOffice has a broad feature set and can stand in for Microsoft Office in many tasks.

 

1. Converting Word to PDF on Windows

1. Install LibreOffice

Download the installer from the official site and run it: https://donate.libreoffice.org/

Directory after installation:

 

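The installation contains several executables. The most important ones are:

soffice.exe --- the all-in-one launcher; double-clicking it opens the Start Center, from which you can create documents in many formats.

sweb.exe --- an HTML editor; it can edit many kinds of files and feels somewhat like Notepad++.

scalc.exe --- similar to Excel; works with spreadsheets.

simpress.exe --- similar to PowerPoint.

swriter.exe --- similar to Word; edits text documents (and can open .docx files).

sbase.exe --- works with databases; it can connect over JDBC or ODBC, which is handy when no other GUI database tool is available.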

 

2. Configure the PATH environment variable (so the command can be invoked from any directory)

3. Run the command to test soffice

C:\Users\liqiang>
LibreOffice 6.0.6.2 0c292870b25a325b5ed35f6b45599d2ea4458e77

Usage: soffice [argument...]
       argument - switches, switch parameters and document URIs (filenames).

Using without special arguments:
Opens the start center, if it is used without any arguments.
   {file}              Tries to open the file (files) in the components
                       suitable for them.
   {file} {macro:///Library.Module.MacroName}
                       Opens the file and runs specified macros from
                       the file.

Getting help and information:
   --help | -h | -?    Shows this help and quits.
   --helpwriter        Opens built-in or online Help on Writer.
   --helpcalc          Opens built-in or online Help on Calc.
   --helpdraw          Opens built-in or online Help on Draw.
   --helpimpress       Opens built-in or online Help on Impress.
   --helpbase          Opens built-in or online Help on Base.
   --helpbasic         Opens built-in or online Help on Basic scripting
                       language.
   --helpmath          Opens built-in or online Help on Math.
   --version           Shows the version and quits.
   --nstemporarydirectory
                       (MacOS X sandbox only) Returns path of the temporary
                       directory for the current user and exits. Overrides
                       all other arguments.

General arguments:
   --quickstart[=no]   Activates[Deactivates] the Quickstarter service.
   --nolockcheck       Disables check for remote instances using one
                       installation.
   --infilter={filter} Force an input filter type if possible. For example:
                       --infilter="Calc Office Open XML"
                       --infilter="Text (encoded):UTF8,LF,,,"
   --pidfile={file}    Store soffice.bin pid to {file}.
   --display {display} Sets the DISPLAY environment variable on UNIX-like
                       platforms to the value {display} (only supported by a
                       start script).

User/programmatic interface control:
   --nologo            Disables the splash screen at program start.
   --minimized         Starts minimized. The splash screen is not displayed.
   --nodefault         Starts without displaying anything except the splash
                       screen (do not display initial window).
   --invisible         Starts in invisible mode. Neither the start-up logo nor
                       the initial program window will be visible. Application
                       can be controlled, and documents and dialogs can be
                       controlled and opened via the API. Using the parameter,
                       the process can only be ended using the taskmanager
                       (Windows) or the kill command (UNIX-like systems). It
                       cannot be used in conjunction with --quickstart.
   --headless          Starts in "headless mode" which allows using the
                       application without GUI. This special mode can be used
                       when the application is controlled by external clients
                       via the API.
   --norestore         Disables restart and file recovery after a system crash.
   --safe-mode         Starts in a safe mode, i.e. starts temporarily with a
                       fresh user profile and helps to restore a broken
                       configuration.
   --accept={UNO-URL}  Specifies an UNO-URL connect-string to create an UNO
                       acceptor through which other programs can connect to
                       access the API. UNO-URL is string the such kind
                   uno:connection-type,params;protocol-name,params;ObjectName.
   --unaccept={UNO-URL} Closes an acceptor that was created with --accept. Use
                       --unaccept=all to close all open acceptors.
   --language={lang}   Uses specified language, if language is not selected
                       yet for UI. The lang is a tag of the language in IETF
                       language tag.

Developer arguments:
   --terminate_after_init
                       Exit after initialization complete (no documents loaded).
   --eventtesting      Exit after loading documents.

New document creation arguments:
The arguments create an empty document of specified kind. Only one of them may
be used in one command line. If filenames are specified after an argument,
then it tries to open those files in the specified component.
   --writer            Creates an empty Writer document.
   --calc              Creates an empty Calc document.
   --draw              Creates an empty Draw document.
   --impress           Creates an empty Impress document.
   --base              Creates a new database.
   --global            Creates an empty Writer master (global) document.
   --math              Creates an empty Math document (formula).
   --web               Creates an empty HTML document.

File open arguments:
The arguments define how following filenames are treated. New treatment begins
after the argument and ends at the next argument. The default treatment is to
open documents for editing, and create new documents from document templates.
   -n                  Treats following files as templates for creation of new
                       documents.
   -o                  Opens following files for editing, regardless whether
                       they are templates or not.
   --pt {Printername}  Prints following files to the printer {Printername},
                       after which those files are closed. The splash screen
                       does not appear. If used multiple times, only last
                       {Printername} is effective for all documents of all
                       --pt runs. Also, --printer-name argument of
                       --print-to-file switch interferes with {Printername}.
   -p                  Prints following files to the default printer, after
                       which those files are closed. The splash screen does
                       not appear. If the file name contains spaces, then it
                       must be enclosed in quotation marks.
   --view              Opens following files in viewer mode (read-only).
   --show              Opens and starts the following presentation documents
                       of each immediately. Files are closed after the showing.
                       Files other than Impress documents are opened in
                       default mode , regardless of previous mode.
   --convert-to OutputFileExtension[:OutputFilterName]
     [--outdir output_dir] [--convert-images-to]
                       Batch convert files (implies --headless). If --outdir
                       isn't specified, then current working directory is used
                       as output_dir. If --convert-images-to is given, its
                       parameter is taken as the target MIME format for *all*
                       images written to the output format. If --convert-to is
                       used more than once, the last value of OutputFileExtension
                       [:OutputFilterName] is effective. If --outdir is used more
                       than once, only its last value is effective. For example:
                   --convert-to pdf *.odt
                   --convert-to epub *.doc
                   --convert-to pdf:writer_pdf_Export --outdir /home/user *.doc
                   --convert-to "html:XHTML Writer File:UTF8" *.doc
                   --convert-to "txt:Text (encoded):UTF8" *.doc
   --print-to-file [--printer-name printer_name] [--outdir output_dir]
                       Batch print files to file. If --outdir is not specified,
                       then current working directory is used as output_dir.
                       If --printer-name or --outdir used multiple times, only
                       last value of each is effective. Also, {Printername} of
                       --pt switch interferes with --printer-name.
   --cat               Dump text content of the following files to console
                       (implies --headless). Cannot be used with --convert-to.
   --script-cat        Dump text content of any scripts embedded in the files to console
                       (implies --headless). Cannot be used with --convert-to.
   -env:<VAR>[=<VALUE>] Set a bootstrap variable. For example: to set
                       a non-default user profile path:
                       -env:UserInstallation=file:///tmp/test

Ignored switches:
   -psn                Ignored (MacOS X only).
   -Embedding          Ignored (COM+ related; Windows only).
   --nofirststartwizard Does nothing, accepted only for backward compatibility.
   --protector {arg1} {arg2}
                       Used only in unit tests and should have two arguments.
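The PATH setup from step 2 can also be checked from Java, which helps a program fail early when the variable is missing. The following is only a sketch: the class `SofficeLocator` and the method `findOnPath` are hypothetical names introduced here, and it assumes the executable is called `soffice.exe` or `soffice`.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper: searches the directories of a PATH-style string for an executable.
final class SofficeLocator {

    // Returns the first regular file in pathValue whose name matches one of the given names.
    static Optional<Path> findOnPath(String pathValue, String... names) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String name : names) {
                try {
                    Path candidate = Paths.get(dir, name);
                    if (Files.isRegularFile(candidate)) {
                        return Optional.of(candidate);
                    }
                } catch (InvalidPathException ignored) {
                    // Skip PATH entries that are not valid paths on this platform.
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Path> found = findOnPath(System.getenv("PATH"), "soffice.exe", "soffice");
        System.out.println(found.map(p -> "soffice found at " + p)
                .orElse("soffice is not on PATH"));
    }
}
```

If the lookup reports that soffice is not on PATH, the environment variable was probably not saved, or the terminal was opened before the change was made.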

 

 

4. Convert to PDF from the command line

Convert into the current directory:

liqiang@root MINGW64 ~/Desktop/新建文件夾 (3)
$ soffice --headless --convert-to pdf ./Java開發-太原科技大學-軟件工程-喬利強.docx
convert C:\Users\liqiang\Desktop\▒½▒▒ļ▒▒▒ (3)\Java▒▒▒▒-̫ԭ▒Ƽ▒▒▒ѧ-▒▒▒▒▒▒▒-▒▒▒▒ǿ.docx -> C:\Users\liqiang\Desktop\▒½▒▒ļ▒▒▒ (3)\Java▒▒▒▒-̫ԭ▒Ƽ▒▒▒ѧ-▒▒▒▒▒▒▒-▒▒▒▒ǿ.pdf using filter : writer_pdf_Export
func=xmlSecCheckVersionExt:file=..\src\xmlsec.c:line=188:obj=unknown:subj=unknown:error=19:invalid version:mode=abi compatible;expected minor version=2;real minor version=2;expected subminor version=25;real subminor version=26

liqiang@root MINGW64 ~/Desktop/新建文件夾 (3)
$ ls
Java開發-太原科技大學-軟件工程-喬利強.docx
Java開發-太原科技大學-軟件工程-喬利強.pdf

 

 

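To write the output to a specific directory, add the --outdir parameter, as in the help example --convert-to pdf:writer_pdf_Export --outdir /home/user *.doc.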

 

5. Converting Word to PDF from a Java program (it works by running the command above as a child process)

import java.io.IOException;
import java.io.InputStream;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class Test {
    private static final Logger logger = LoggerFactory.getLogger(Test.class);

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        String srcPath = "C:/Users/liqiang/Desktop/ww/tt.docx", desPath = "C:/Users/liqiang/Desktop/ww";
        String command = "";
        String osName = System.getProperty("os.name");
        if (osName.contains("Windows")) {
            command = "soffice --headless --convert-to pdf " + srcPath + " --outdir " + desPath;
            exec(command);
        }
        long end = System.currentTimeMillis();
        logger.debug("Elapsed: {} ms", end - start);
    }

    public static boolean exec(String command) {
        Process process; // Process controls the child process and exposes information about it
        try {
            logger.debug("exec cmd : {}", command);
            // exec() asks the JVM to create a child process that runs the given
            // executable, and returns the Process object for that child.
            process = Runtime.getRuntime().exec(command);
            // The child's stderr and stdout are available through these two streams
            InputStream errorStream = process.getErrorStream();
            InputStream inputStream = process.getInputStream();
        } catch (IOException e) {
            logger.error(" exec {} error", command, e);
            return false;
        }

        int exitStatus = 0;
        try {
            // Wait for the child process to finish; the return value is its
            // exit code, where 0 means normal termination
            exitStatus = process.waitFor();
            // A second way to read the exit code once the process has finished
            int i = process.exitValue();
            logger.debug("i----" + i);
        } catch (InterruptedException e) {
            logger.error("InterruptedException  exec {}", command, e);
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }

        if (exitStatus != 0) {
            logger.error("exec cmd exitStatus {}", exitStatus);
        } else {
            logger.debug("exec cmd exitStatus {}", exitStatus);
        }

        process.destroy(); // destroy the child process
        process = null;

        // Report success only when the converter exited normally
        return exitStatus == 0;
    }

}

Another way to issue the command is  cmd /c soffice ..... .

It is also best to append :writer_pdf_Export after pdf, for example --convert-to pdf:writer_pdf_Export. Naming the filter explicitly may help in cases where the default conversion fails.

Result:

2018-10-25 21:56:35 [Test]-[DEBUG] exec cmd : soffice --headless --convert-to pdf C:/Users/liqiang/Desktop/ww/tt.docx --outdir C:/Users/liqiang/Desktop/ww
2018-10-25 21:56:45 [Test]-[DEBUG] i----0
2018-10-25 21:56:45 [Test]-[DEBUG] exec cmd exitStatus 0
2018-10-25 21:56:45 [Test]-[DEBUG] Elapsed: 9980 ms
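The program above passes the whole command to Runtime.exec as one string, which Java splits on whitespace, so a source path or output directory containing a space would be broken into several arguments. A possible alternative, sketched below, passes the arguments as a list through ProcessBuilder and reads the converter's output so the child process cannot stall on a full pipe. The class `PdfConverter` and its methods are hypothetical names, not part of the original program.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; a sketch, not the original author's code.
final class PdfConverter {

    // Builds the argument list. Each element reaches the process unchanged,
    // so paths that contain spaces need no extra quoting.
    static List<String> buildCommand(String executable, String srcPath, String outDir) {
        return Arrays.asList(executable, "--headless",
                "--convert-to", "pdf:writer_pdf_Export",
                "--outdir", outDir, srcPath);
    }

    // Runs the conversion, echoes the converter's output and returns its exit code.
    static int convert(String executable, String srcPath, String outDir)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(buildCommand(executable, srcPath, outDir));
        builder.redirectErrorStream(true); // merge stderr into stdout so one reader drains both
        Process process = builder.start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Example paths taken from the program above; adjust them for your machine.
        int exit = convert("soffice", "C:/Users/liqiang/Desktop/ww/tt.docx",
                "C:/Users/liqiang/Desktop/ww");
        System.out.println("exit code: " + exit);
    }
}
```

The argument order follows the help example --convert-to pdf:writer_pdf_Export --outdir /home/user *.doc, with the source file last.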

 

 

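2. Converting Word to PDF on Linux, using CentOS as an example

1. Install LibreOffice on Linux

1. Download

We install with yum, so first download the RPM archives. Three archives are needed.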

wget http://mirrors.ustc.edu.cn/tdf/libreoffice/stable/6.0.6/rpm/x86_64/LibreOffice_6.0.6_Linux_x86-64_rpm.tar.gz
wget http://mirrors.ustc.edu.cn/tdf/libreoffice/stable/6.0.6/rpm/x86_64/LibreOffice_6.0.6_Linux_x86-64_rpm_sdk.tar.gz
wget http://mirrors.ustc.edu.cn/tdf/libreoffice/stable/6.0.6/rpm/x86_64/LibreOffice_6.0.6_Linux_x86-64_rpm_langpack_zh-CN.tar.gz

 

  

The same links can be opened in a browser on Windows to download the tar.gz archives. To get a different version, change the version number in the URL. For example, 6.0.3 is available at:

http://mirrors.ustc.edu.cn/tdf/libreoffice/stable/6.0.3/rpm/x86_64/LibreOffice_6.0.3_Linux_x86-64_rpm.tar.gz

 

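If wget cannot download the files, another option is to download them on Windows and then upload them to the Linux server.

2. Upload to the server and extract

Extract each archive with  tar -xvf xxxxxx.tar.gz. The result looks like this: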

[root@VM_0_12_centos libreoffice]# ll
total 263748
drwxr-xr-x 4 root root      4096 Jul 28 06:07 LibreOffice_6.0.6.2_Linux_x86-64_rpm
drwxr-xr-x 3 root root      4096 Jul 28 07:32 LibreOffice_6.0.6.2_Linux_x86-64_rpm_langpack_zh-CN
drwxr-xr-x 3 root root      4096 Jul 28 06:27 LibreOffice_6.0.6.2_Linux_x86-64_rpm_sdk
-rw-r--r-- 1 root root    798421 Oct 25 12:13 LibreOffice_6.0.6_Linux_x86-64_rpm_langpack_zh-CN.tar.gz
-rw-r--r-- 1 root root  36919386 Oct 25 12:24 LibreOffice_6.0.6_Linux_x86-64_rpm_sdk.tar.gz
-rw-r--r-- 1 root root 213845646 Oct 25 10:33 LibreOffice_6.0.6_Linux_x86-64_rpm.tar.gz

 

 

3. Install the RPM files with yum localinstall *.rpm

[root@VM_0_12_centos RPMS]# pwd
/opt/libreoffice/LibreOffice_6.0.6.2_Linux_x86-64_rpm/RPMS
[root@VM_0_12_centos RPMS]# yum localinstall *.rpm

 

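The RPMS directory holds the RPM files to install; change into it and install them with a wildcard. (Repeat this for the RPMS directory of each of the three extracted archives.)

4. Test LibreOffice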

[root@VM_0_12_centos RPMS]# libreoffice6.0 -help
Warning: -help is deprecated.  Use --help instead.
LibreOffice 6.0.6.2 0c292870b25a325b5ed35f6b45599d2ea4458e77

Usage: soffice [argument...]
       argument - switches, switch parameters and document URIs (filenames).

Using without special arguments:
Opens the start center, if it is used without any arguments.
   {file}              Tries to open the file (files) in the components
                       suitable for them.
   {file} {macro:///Library.Module.MacroName}
                       Opens the file and runs specified macros from
                       the file.

Getting help and information:
   --help | -h | -?    Shows this help and quits.
   --helpwriter        Opens built-in or online Help on Writer.
   --helpcalc          Opens built-in or online Help on Calc.
   --helpdraw          Opens built-in or online Help on Draw.
   --helpimpress       Opens built-in or online Help on Impress.
   --helpbase          Opens built-in or online Help on Base.
   --helpbasic         Opens built-in or online Help on Basic scripting
                       language.
   --helpmath          Opens built-in or online Help on Math.
   --version           Shows the version and quits.
   --nstemporarydirectory
                       (MacOS X sandbox only) Returns path of the temporary
                       directory for the current user and exits. Overrides
                       all other arguments.

General arguments:
   --quickstart[=no]   Activates[Deactivates] the Quickstarter service.
   --nolockcheck       Disables check for remote instances using one
                       installation.
   --infilter={filter} Force an input filter type if possible. For example:
                       --infilter="Calc Office Open XML"
                       --infilter="Text (encoded):UTF8,LF,,,"
   --pidfile={file}    Store soffice.bin pid to {file}.
   --display {display} Sets the DISPLAY environment variable on UNIX-like
                       platforms to the value {display} (only supported by a
                       start script).

User/programmatic interface control:
   --nologo            Disables the splash screen at program start.
   --minimized         Starts minimized. The splash screen is not displayed.
   --nodefault         Starts without displaying anything except the splash
                       screen (do not display initial window).
   --invisible         Starts in invisible mode. Neither the start-up logo nor
                       the initial program window will be visible. Application
                       can be controlled, and documents and dialogs can be
                       controlled and opened via the API. Using the parameter,
                       the process can only be ended using the taskmanager
                       (Windows) or the kill command (UNIX-like systems). It
                       cannot be used in conjunction with --quickstart.
   --headless          Starts in "headless mode" which allows using the
                       application without GUI. This special mode can be used
                       when the application is controlled by external clients
                       via the API.
   --norestore         Disables restart and file recovery after a system crash.
   --safe-mode         Starts in a safe mode, i.e. starts temporarily with a
                       fresh user profile and helps to restore a broken
                       configuration.
   --accept={UNO-URL}  Specifies an UNO-URL connect-string to create an UNO
                       acceptor through which other programs can connect to
                       access the API. UNO-URL is string the such kind
                   uno:connection-type,params;protocol-name,params;ObjectName.
   --unaccept={UNO-URL} Closes an acceptor that was created with --accept. Use
                       --unaccept=all to close all open acceptors.
   --language={lang}   Uses specified language, if language is not selected
                       yet for UI. The lang is a tag of the language in IETF
                       language tag.

Developer arguments:
   --terminate_after_init
                       Exit after initialization complete (no documents loaded).
   --eventtesting      Exit after loading documents.

New document creation arguments:
The arguments create an empty document of specified kind. Only one of them may
be used in one command line. If filenames are specified after an argument,
then it tries to open those files in the specified component.
   --writer            Creates an empty Writer document.
   --calc              Creates an empty Calc document.
   --draw              Creates an empty Draw document.
   --impress           Creates an empty Impress document.
   --base              Creates a new database.
   --global            Creates an empty Writer master (global) document.
   --math              Creates an empty Math document (formula).
   --web               Creates an empty HTML document.

File open arguments:
The arguments define how following filenames are treated. New treatment begins
after the argument and ends at the next argument. The default treatment is to
open documents for editing, and create new documents from document templates.
   -n                  Treats following files as templates for creation of new
                       documents.
   -o                  Opens following files for editing, regardless whether
                       they are templates or not.
   --pt {Printername}  Prints following files to the printer {Printername},
                       after which those files are closed. The splash screen
                       does not appear. If used multiple times, only last
                       {Printername} is effective for all documents of all
                       --pt runs. Also, --printer-name argument of
                       --print-to-file switch interferes with {Printername}.
   -p                  Prints following files to the default printer, after
                       which those files are closed. The splash screen does
                       not appear. If the file name contains spaces, then it
                       must be enclosed in quotation marks.
   --view              Opens following files in viewer mode (read-only).
   --show              Opens and starts the following presentation documents
                       of each immediately. Files are closed after the showing.
                       Files other than Impress documents are opened in
                       default mode , regardless of previous mode.
   --convert-to OutputFileExtension[:OutputFilterName]
     [--outdir output_dir] [--convert-images-to]
                       Batch convert files (implies --headless). If --outdir
                       isn't specified, then current working directory is used
                       as output_dir. If --convert-images-to is given, its
                       parameter is taken as the target MIME format for *all*
                       images written to the output format. If --convert-to is
                       used more than once, the last value of OutputFileExtension
                       [:OutputFilterName] is effective. If --outdir is used more
                       than once, only its last value is effective. For example:
                   --convert-to pdf *.odt
                   --convert-to epub *.doc
                   --convert-to pdf:writer_pdf_Export --outdir /home/user *.doc
                   --convert-to "html:XHTML Writer File:UTF8" *.doc
                   --convert-to "txt:Text (encoded):UTF8" *.doc
   --print-to-file [--printer-name printer_name] [--outdir output_dir]
                       Batch print files to file. If --outdir is not specified,
                       then current working directory is used as output_dir.
                       If --printer-name or --outdir used multiple times, only
                       last value of each is effective. Also, {Printername} of
                       --pt switch interferes with --printer-name.
   --cat               Dump text content of the following files to console
                       (implies --headless). Cannot be used with --convert-to.
   --script-cat        Dump text content of any scripts embedded in the files to console
                       (implies --headless). Cannot be used with --convert-to.
   -env:<VAR>[=<VALUE>] Set a bootstrap variable. For example: to set
                       a non-default user profile path:
                       -env:UserInstallation=file:///tmp/test

Ignored switches:
   -psn                Ignored (MacOS X only).
   -Embedding          Ignored (COM+ related; Windows only).
   --nofirststartwizard Does nothing, accepted only for backward compatibility.
   --protector {arg1} {arg2}
                       Used only in unit tests and should have two arguments.

 

After installation the command is named libreoffice6.0.

5. Create an alias so the command can be invoked as libreoffice

[root@VM_0_12_centos ~]# alias libreoffice='libreoffice6.0'
[root@VM_0_12_centos ~]# alias
alias cp='cp -i'
alias egrep='egrep --color=auto'
alias fgrep='fgrep --color=auto'
alias grep='grep --color=auto'
alias l.='ls -d .* --color=auto'
alias libreoffice='libreoffice6.0'
alias ll='ls -l --color=auto'
alias ls='ls --color=auto'
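A shell alias is expanded only by the interactive shell, so a process launched from Java with Runtime.exec or ProcessBuilder does not see it and must use the real command name. One way to handle this, sketched below, is to pick the executable name from the os.name system property. The class `OfficeCommand` is a hypothetical name, and the two command names are simply the ones used in this article (soffice on Windows, libreoffice6.0 on this CentOS installation).

```java
import java.util.Locale;

// Hypothetical helper: chooses the LibreOffice command name for the current platform.
final class OfficeCommand {

    // "soffice" on Windows, otherwise the versioned name installed by the 6.0 RPMs.
    static String forOs(String osName) {
        String os = osName == null ? "" : osName.toLowerCase(Locale.ROOT);
        return os.contains("windows") ? "soffice" : "libreoffice6.0";
    }

    public static void main(String[] args) {
        System.out.println(forOs(System.getProperty("os.name")));
    }
}
```

A different LibreOffice version installs a differently named command, so the Linux name would need adjusting.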

 

 

2. Test Word-to-PDF conversion from the Linux command line (the arguments are largely the same as on Windows)

[root@VM_0_12_centos tmpFile]# ls
tt.docx
[root@VM_0_12_centos tmpFile]# libreoffice6.0 --convert-to pdf:writer_pdf_Export ./tt.docx
func=xmlSecCheckVersionExt:file=xmlsec.c:line=188:obj=unknown:subj=unknown:error=19:invalid version:mode=abi compatible;expected minor version=2;real minor version=2;expected subminor version=25;real subminor version=26
convert /root/tmpFile/tt.docx -> /root/tmpFile/tt.pdf using filter : writer_pdf_Export
[root@VM_0_12_centos tmpFile]# ls
tt.docx  tt.pdf
[root@VM_0_12_centos tmpFile]#

 

 

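Copying the generated PDF back to Windows and opening it shows that the Chinese text is garbled.

3. Fixing garbled Chinese text in the converted PDF

1. Find the font directories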

[root@VM_0_12_centos tmpFile]# cat /etc/fonts/fonts.conf | grep fon
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<!-- /etc/fonts/fonts.conf file to configure system font access -->
<fontconfig>
        problems to the fontconfig bugzilla system located at fontconfig.org
        Note that the normal 'make install' procedure for fontconfig is to
        replace any existing fonts.conf file with the new version. Place
        <dir>/usr/share/fonts</dir>
        <dir>/usr/share/X11/fonts/Type1</dir> <dir>/usr/share/X11/fonts/TTF</dir> <dir>/usr/local/share/fonts</dir>
        <dir prefix="xdg">fonts</dir>
        <dir>~/.fonts</dir>
        <include ignore_missing="yes">/etc/fonts/conf.d</include>
        <cachedir>/var/cache/fontconfig</cachedir>
        <cachedir prefix="xdg">fontconfig</cachedir>
        <cachedir>~/.fontconfig</cachedir>
  in fonts.  All other blank chars are assumed to be broken and
</fontconfig>

 

 

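The output shows that fonts are stored under /usr/share/fonts.

2. Upload the SimSun font (simsun.ttc) from C:\Windows\Fonts to the Linux server, copy it into the font directory above, and make sure it is readable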

[root@VM_0_12_centos libreoffice]# ll | grep simsun.ttc
-rw-r--r-- 1 root root  18214472 Oct 25 13:19 simsun.ttc

 

cp simsun.ttc /usr/share/fonts

 

cd /usr/share/fonts

 

Set the permissions (the defaults usually work; set them manually only if they do not):

chmod 644 simsun.ttc

 

3. Refresh the font cache

fc-cache -fv

 

 

Converting to PDF again now renders the Chinese text correctly.
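When the font has to be installed as part of an automated deployment, the copy and chmod steps above can be reproduced from Java. The sketch below is an illustration only: the class `FontInstaller` is a hypothetical name, it works only on POSIX file systems, and writing to /usr/share/fonts requires root privileges just as the shell commands do.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.PosixFilePermissions;

// Hypothetical helper mirroring "cp simsun.ttc /usr/share/fonts" and "chmod 644 simsun.ttc".
final class FontInstaller {

    // Copies the font into fontDir and sets rw-r--r-- (644) on the copy.
    static Path install(Path font, Path fontDir) {
        try {
            Path target = fontDir.resolve(font.getFileName());
            Files.copy(font, target, StandardCopyOption.REPLACE_EXISTING);
            Files.setPosixFilePermissions(target, PosixFilePermissions.fromString("rw-r--r--"));
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        Path installed = install(Paths.get("simsun.ttc"), Paths.get("/usr/share/fonts"));
        System.out.println("installed " + installed);
        // Rebuild the font cache, as in step 3.
        int exit = new ProcessBuilder("fc-cache", "-fv").inheritIO().start().waitFor();
        System.out.println("fc-cache exit code: " + exit);
    }
}
```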

 

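4. Calling LibreOffice from a Java program on Linux to convert to PDF

The file location and the output directory are passed in as arguments to main.

(1) Start with a simple test program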

import java.io.IOException;

public class Test {

    public static void main(String[] args) {
        // args[0]: file to convert, args[1]: output directory
        String filePath = args[0];
        String destDir = args[1];
        String osName = System.getProperty("os.name");
        System.out.println(filePath);
        System.out.println(destDir);
        System.out.println(osName);
        String cmd = "libreoffice6.0 --convert-to pdf:writer_pdf_Export " + filePath + " --outdir " + destDir;
        System.out.println(cmd);
        try {
            // Starts the conversion as a child process; this call does not
            // wait for the conversion to finish
            Runtime.getRuntime().exec(cmd);
        } catch (IOException e) {
            System.err.println(e.getMessage());
        }
    }

}

 

Compile and run it on Linux:

[root@VM_0_12_centos tmpFile]# javac Test.java
[root@VM_0_12_centos tmpFile]# java Test ./tt.docx ./
./tt.docx
./
Linux
libreoffice6.0 --convert-to pdf:writer_pdf_Export ./tt.docx --outdir ./
[root@VM_0_12_centos tmpFile]# ls
Test.class  Test.java  tt.docx  tt.pdf

 

 

(2) Next, extend the program to measure the conversion time (the calling thread waits for the conversion to finish)

import java.io.IOException;

public class Test {

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        String filePath = args[0];
        String destDir = args[1];
        String osName = System.getProperty("os.name");
        System.out.println(filePath);
        System.out.println(destDir);
        System.out.println(osName);
        String cmd = "libreoffice6.0 --convert-to pdf:writer_pdf_Export " + filePath + " --outdir " + destDir;
        System.out.println(cmd);
        try {
            Process process = Runtime.getRuntime().exec(cmd);
            try {
                // Block until the conversion finishes and read the exit status
                int status = process.waitFor();
                // Release the process handle
                process.destroy();
                process = null;
                System.out.println("status -> " + status);
            } catch (InterruptedException e) {
                System.err.println(e.getMessage());
            }
        } catch (IOException e) {
            System.err.println(e.getMessage());
        }
        long end = System.currentTimeMillis();
        System.out.println("Elapsed: " + (end - start) + "ms");
    }

}

Compile and run it on Linux again:

[root@VM_0_12_centos tmpFile]# java Test ./tt.docx ./
./tt.docx
./
Linux
libreoffice6.0 --convert-to pdf:writer_pdf_Export ./tt.docx --outdir ./
status -> 0
Elapsed: 1463ms
[root@VM_0_12_centos tmpFile]# ls
Test.class  Test.java  tt.docx  tt.pdf
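Calling waitFor() with no arguments blocks for as long as the converter runs, so a hung conversion would hang the caller too. The sketch below bounds the wait with Process.waitFor(long, TimeUnit) and then checks that the PDF really exists. It assumes the output is named after the source file with a .pdf extension, as in the tt.docx to tt.pdf run above. The class `TimedConversion` and the 60-second timeout are hypothetical choices for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: conversion with a time limit and an output check.
final class TimedConversion {

    // tt.docx converted into outDir is expected to appear as outDir/tt.pdf.
    static Path expectedPdf(Path source, Path outDir) {
        String name = source.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return outDir.resolve(base + ".pdf");
    }

    // Returns true only if the converter exits with 0 within the timeout
    // and the expected PDF exists afterwards.
    static boolean convert(Path source, Path outDir, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder("libreoffice6.0", "--headless",
                "--convert-to", "pdf:writer_pdf_Export",
                "--outdir", outDir.toString(), source.toString())
                .inheritIO() // show the converter's messages on this program's console
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly(); // the conversion hung: kill the child process
            return false;
        }
        return process.exitValue() == 0 && Files.isRegularFile(expectedPdf(source, outDir));
    }

    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        boolean ok = convert(Paths.get(args[0]), Paths.get(args[1]), 60);
        System.out.println("converted: " + ok + ", elapsed: "
                + (System.currentTimeMillis() - start) + " ms");
    }
}
```

It is run the same way as the programs above, for example java TimedConversion ./tt.docx ./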

 

 

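That completes PDF conversion with LibreOffice on both Windows and Linux; in my experience this approach is fairly stable. Along the way we also saw how to use Runtime to launch a local program and block the calling thread until it finishes.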

 

All the tar archives used in this article, plus the simsun.ttc font, can be downloaded from: http://qiaoliqiang.cn/fileDown/linuxlibreoffice.zip

 

Addendum: Word can also be converted to HTML. Testing Word to HTML.

Word content:

 

 

soffice.exe --headless --convert-to html .\通用功能需求收集20180723.docx

 

Result:

Addendum: Word can be converted to JPG

soffice.exe --headless --convert-to jpg .\通用功能需求收集20180723.docx

 

The resulting JPG:

Addendum: Word can be converted to TXT

soffice.exe --headless --convert-to txt .\通用功能需求收集20180723.docx

 

Result:

 

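Addendum: Excel and PowerPoint files can also be converted to PDF, HTML, and JPG. The Excel conversion is examined below. Two caveats: cell borders are dropped unless borders are explicitly set in the workbook's cell styles, and content that is too long wraps; the workaround is to narrow the columns or reduce the number of columns.

Original Excel content:

Conversion: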

soffice.exe --headless --convert-to jpg ./test.xls

soffice.exe --headless --convert-to html ./test.xls

soffice.exe --headless --convert-to pdf ./test.xls

 

(1) The converted JPG

(2) The converted HTML

(3) The converted PDF
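The commands above let LibreOffice choose the export filter, while the help text shows that a filter name can follow the extension (pdf:writer_pdf_Export). When one program converts Word, Excel, and PowerPoint files, the filter could be chosen from the source extension, as sketched below. The class `PdfFilter` is a hypothetical name. The filter names calc_pdf_Export and impress_pdf_Export do not appear in this article; they are assumed here to be the Calc and Impress counterparts of writer_pdf_Export and should be verified against your installation.

```java
import java.util.Locale;

// Hypothetical helper: picks the value for --convert-to from the source file name.
final class PdfFilter {

    static String convertToArgument(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot >= 0 ? fileName.substring(dot + 1).toLowerCase(Locale.ROOT) : "";
        switch (ext) {
            case "doc":
            case "docx":
            case "odt":
                return "pdf:writer_pdf_Export";   // text documents
            case "xls":
            case "xlsx":
            case "ods":
                return "pdf:calc_pdf_Export";     // spreadsheets (assumed filter name)
            case "ppt":
            case "pptx":
            case "odp":
                return "pdf:impress_pdf_Export";  // presentations (assumed filter name)
            default:
                return "pdf";                     // let LibreOffice pick the filter
        }
    }

    public static void main(String[] args) {
        System.out.println(convertToArgument("test.xls"));
    }
}
```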

 

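Addendum: a problem when copying the installation directory directly:

When I copied an already-installed LibreOffice directory to another machine and ran it, it failed with errors about missing VCRUNTIME140.dll and MSVCP140.dll. Copying those DLLs over from a working machine fixed it. Note that the DLLs must be copied from both C:\Windows\System32 and C:\Windows\SysWOW64 into the matching directory on the target machine: the files in the two folders have the same names but different sizes, so each must go to its own counterpart. Both DLLs belong to the Microsoft Visual C++ Redistributable, so installing that package should also resolve the error.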

 

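Addendum: Java can also perform the conversion with jodconverter. I used jodconverter 2.2 (the library depends on an OpenOffice or LibreOffice installation).

The required JAR files are:

The code is: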

import java.io.File;
import java.io.IOException;

import com.artofsolving.jodconverter.DocumentConverter;
import com.artofsolving.jodconverter.openoffice.connection.OpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.connection.SocketOpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.converter.OpenOfficeDocumentConverter;

public class Office2Pdf {

    // Convert an office document (Word, Excel, ...) to PDF
    public static void WordToPDF(String startFile, String overFile) throws IOException {
        // Source file
        File inputFile = new File(startFile);
        if (!inputFile.exists()) {
            System.out.println("Source file does not exist!");
            return;
        }

        // Output file; create its parent directory if it is missing
        File outputFile = new File(overFile);
        if (!outputFile.getParentFile().exists()) {
            outputFile.getParentFile().mkdirs();
        }

        // Start the office service process.
        // Adjust the path below to wherever OpenOffice/LibreOffice is installed on your machine.
        String command = "D:/zdc8/lo/program/soffice.exe -headless -accept=\"socket,host=127.0.0.1,port=8300;urp;\"";
        Process p = Runtime.getRuntime().exec(command);

        OpenOfficeConnection connection = new SocketOpenOfficeConnection("127.0.0.1", 8300);
        try {
            // Connect to the office service
            connection.connect();

            // Convert
            DocumentConverter converter = new OpenOfficeDocumentConverter(connection);
            converter.convert(inputFile, outputFile);
        } finally {
            // Close the connection
            if (connection.isConnected()) {
                connection.disconnect();
            }

            // Stop the process
            p.destroy();
        }
    }

    public static void main(String[] args) {
        String start = "C:\\Users\\Administrator\\Desktop\\123.xlsx";
        String over = "C:\\Users\\Administrator\\Desktop\\123.xlsx.pdf";
        try {
            WordToPDF(start, over);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
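The code above connects to port 8300 immediately after launching soffice, so the connection can fail if the office process has not opened its socket yet. One possible safeguard, sketched below, is to poll the port until it accepts a TCP connection before calling connection.connect(). The class `PortWaiter`, the 500 ms connect timeout, and the 200 ms retry interval are hypothetical choices. A successful TCP connect only shows that something is listening on the port, which is assumed here to be good enough.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: waits until a TCP port accepts connections.
final class PortWaiter {

    // Polls host:port until a connection succeeds or timeoutMillis elapses.
    static boolean waitForPort(String host, int port, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 500);
                return true; // a listener is accepting connections
            } catch (IOException notReadyYet) {
                try {
                    Thread.sleep(200); // back off briefly before retrying
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Host and port match the -accept string used in the code above.
        System.out.println("ready: " + waitForPort("127.0.0.1", 8300, 10_000));
    }
}
```

In WordToPDF, such a call would sit between Runtime.getRuntime().exec(command) and connection.connect().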

 

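To hide tracked changes (revision marks) in the output, decompile jodconverter-2.2.2.jar, take the class OpenOfficeDocumentConverter.java, and modify the method loadAndExport as follows (the statements that set RedlineDisplayType are the added code):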

    private void loadAndExport(String inputUrl, Map/* <String,Object> */ loadProperties, String outputUrl,
            Map/* <String,Object> */ storeProperties) throws OpenOfficeException {
        XComponent document;
        try {
            document = loadDocument(inputUrl, loadProperties);
        } catch (ErrorCodeIOException errorCodeIOException) {
            throw new OpenOfficeException(
                    "conversion failed: could not load input document; OOo errorCode: " + errorCodeIOException.ErrCode,
                    errorCodeIOException);
        } catch (Exception otherException) {
            throw new OpenOfficeException("conversion failed: could not load input document", otherException);
        }
        if (document == null) {
            throw new OpenOfficeException("conversion failed: could not load input document");
        }

        // Added code: hide tracked changes (redlines) before exporting
        XPropertySet mxDocProps = (XPropertySet) UnoRuntime.queryInterface(XPropertySet.class, document);
        try {
            mxDocProps.setPropertyValue("RedlineDisplayType", RedlineDisplayType.NONE);
        } catch (Exception e) {
            throw new OpenOfficeException("dispose RedlineDisplay failed", e);
        }

        refreshDocument(document);

        try {
            storeDocument(document, outputUrl, storeProperties);
        } catch (ErrorCodeIOException errorCodeIOException) {
            throw new OpenOfficeException(
                    "conversion failed: could not save output document; OOo errorCode: " + errorCodeIOException.ErrCode,
                    errorCodeIOException);
        } catch (Exception otherException) {
            throw new OpenOfficeException("conversion failed: could not save output document", otherException);
        }
    }

 

Addendum: kkFileView is an online file preview tool built on LibreOffice and jodconverter; it is feature-rich and simple to use.

Git repository:  https://github.com/kekingcn/kkFileView

Blog post:  https://my.oschina.net/keking/blog/3064732
