Windows Speech Development in Unity Using SAPI

Date: 2019-12-12


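Speech technology in software mainly covers two areas: speech recognition and speech synthesis. Developers usually lack the technical and financial resources to build a professional speech engine themselves, so they generally build on an existing one. Well-known engines in China include iFLYTEK (科大訊飛) and Baidu Speech; abroad there are Google's speech services, Microsoft's SAPI, and others.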

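In our VR development the application runs on Windows, so SAPI was the natural first choice for speech. First, it is native to Windows; second, it works offline and needs no network connection; third, it needs no plugins. In addition, SAPI's pronunciation, English in particular, is of fairly good quality. (It ships with Windows 7 and later.)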

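Using SAPI from C# requires System.Speech.dll. Unity requires DLL files to be placed under the Assets directory, but doing so produces the error "sapi failed to initialize". We suspect that the DLL's API needs a specific runtime context, and that copying the DLL into Assets loses that context so it cannot run.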

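Anyone who has done this kind of development knows, however, that referencing System.Speech.dll from other C# applications works without any problem. So could we write a separate helper program and have Unity call it? Following this idea, we wrote a console program, Speech.exe, whose main job is to synthesize speech from input text.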

The code is fairly simple:

/* A simple SAPI speech-synthesis console program */

using System.Speech.Synthesis;
using SpeechTest.Properties;

namespace SpeechTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var speaker = new SpeechSynthesizer();
            speaker.Speak("test");
        }
    }
}

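OK, run it and you will hear the machine say "test".

Next we change it to read the text from the command-line arguments, so that Unity can launch Speech.exe through Process and pass the text to it as an argument.

/* Read the text to speak from the arguments */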

using System.Speech.Synthesis;
using SpeechTest.Properties;

namespace SpeechTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var speaker = new SpeechSynthesizer();
            var res = args.Length == 0 ? "請說" : args[0];
            speaker.Speak(res);
        }
    }
}

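First we open a CMD prompt, cd to the directory containing Speech.exe, and enter Speech.exe test. As expected, the machine says "test". The test passes.

To make the voice configurable, we add some code that reads the relevant values from Settings. The code changes as follows:

/* Configurable console program */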

using System.Speech.Synthesis;
using SpeechTest.Properties;

namespace SpeechTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var speaker = new SpeechSynthesizer();
            speaker.Volume = Settings.Default.SpeakVolume;
            speaker.Rate = Settings.Default.SpeakRate;
            var voice = Settings.Default.SpeakVoice;
            if (!string.IsNullOrEmpty(voice))
                speaker.SelectVoice(voice);
            var res = args.Length == 0 ? "請說" : args[0];
            speaker.Speak(res);
        }
    }
}
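
The values read from Settings come from a configuration file that can be edited by hand. SpeechSynthesizer.Volume is documented as 0 to 100 and SpeechSynthesizer.Rate as -10 to 10, so it may be worth clamping the configured values before assigning them. The following is a minimal sketch of that idea; the class SpeechSettingsGuard and its method names are hypothetical and not part of the original program.

```csharp
using System;

// Hypothetical helper (not in the original program): keeps configured
// values inside the documented SpeechSynthesizer ranges.
public static class SpeechSettingsGuard
{
    // SpeechSynthesizer.Volume is documented as 0..100.
    public static int ClampVolume(int configured)
    {
        return Math.Max(0, Math.Min(100, configured));
    }

    // SpeechSynthesizer.Rate is documented as -10..10.
    public static int ClampRate(int configured)
    {
        return Math.Max(-10, Math.Min(10, configured));
    }
}
```

In the listing above this would be used as speaker.Volume = SpeechSettingsGuard.ClampVolume(Settings.Default.SpeakVolume); and likewise for Rate.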

Next, we use Process in Unity to start Speech.exe. The code is as follows:

/* Start the Speech.exe process from Unity */

using System.Diagnostics;
using UnityEngine;

public class Speecher: MonoBehaviour
{
    public static void Speak(string str)
    {
        var proc = new Process
        {
            StartInfo = new ProcessStartInfo
            {
                FileName = "speech.exe",
                Arguments = "\"" + str + "\"",
            }
        };
        proc.Start();
    }

    /*** Test code, can be removed: Start ***/
    protected void Start()
    {
        Speak("test");
    }
    /*** Test code, can be removed: End ***/
}

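Attach the script to any GO (GameObject) and run it. A black console window appears and the speech is heard at the same time. Test complete.

Next we hide the console window. The code changes as follows:

/* Start Speech.exe from Unity without a window */

using System.Diagnostics;
using UnityEngine;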

public class Speecher: MonoBehaviour
{
    public static void Speak(string str)
    {
        var proc = new Process
        {
            StartInfo = new ProcessStartInfo
            {
                FileName = "speech.exe",
                Arguments = "\"" + str + "\"",
                CreateNoWindow = true,
                WindowStyle = ProcessWindowStyle.Hidden,
            }
        };
        proc.Start();
    }
    /*** Test code, can be removed: Start ***/
    protected void Start()
    {
        Speak("test");
    }
    /*** Test code, can be removed: End ***/
}
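
The Speak method above builds Arguments by wrapping str in double quotes. That works for plain text, but text that itself contains a double quote, or that ends in a backslash, would be split or altered when Speech.exe parses its command line. The following is a sketch of one way to escape a single argument under the usual Windows command-line parsing rules; the class ArgumentQuoting and the method QuoteArgument are hypothetical names, not part of the original code.

```csharp
using System.Text;

// Hypothetical helper (not in the original code): quotes one argument so
// that the standard Windows argument parser yields the original text.
public static class ArgumentQuoting
{
    public static string QuoteArgument(string arg)
    {
        var sb = new StringBuilder();
        sb.Append('"');
        int backslashes = 0;
        foreach (char c in arg)
        {
            if (c == '\\')
            {
                // Defer backslashes: their meaning depends on what follows.
                backslashes++;
                continue;
            }
            if (c == '"')
            {
                // Backslashes before a quote are doubled, then the quote is escaped.
                sb.Append('\\', backslashes * 2 + 1);
                sb.Append('"');
            }
            else
            {
                // Backslashes before an ordinary character are literal.
                sb.Append('\\', backslashes);
                sb.Append(c);
            }
            backslashes = 0;
        }
        // Trailing backslashes sit before the closing quote, so double them.
        sb.Append('\\', backslashes * 2);
        sb.Append('"');
        return sb.ToString();
    }
}
```

With this helper, the assignment would read Arguments = ArgumentQuoting.QuoteArgument(str).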

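At this point the main functionality is complete. A careful reader will notice, though, that repeatedly creating and then closing a process is rather clumsy. Could the Speech process stay running and speak whenever it receives a message from Unity? That calls for inter-process communication.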

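Windows processes are independent of one another, each with its own address space, but that does not mean they cannot communicate. There are many ways to do it, such as reading and writing files, sending messages (hook), and sockets. Sockets are relatively simple to implement, and since we already had a socket wrapper library, only a small amount of code was needed.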
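
The socket wrapper library itself is not shown in this article. To illustrate the underlying idea only, here is a minimal sketch that sends one text message over a loopback TCP connection using the standard TcpListener and TcpClient classes. The class LoopbackDemo, the method RoundTrip, and the choice of UTF-8 are assumptions made for this illustration; both ends run in one process here just to keep the example self-contained.

```csharp
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

// Illustrative sketch only: both ends live in one process for brevity.
public static class LoopbackDemo
{
    public static string RoundTrip(string message)
    {
        // Port 0 asks the OS for any free port on 127.0.0.1.
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        // "Server" side: accept one client and read until it disconnects.
        Task<string> serverTask = Task.Run(() =>
        {
            using (TcpClient peer = listener.AcceptTcpClient())
            using (NetworkStream stream = peer.GetStream())
            using (var received = new MemoryStream())
            {
                var buffer = new byte[1024];
                int n;
                while ((n = stream.Read(buffer, 0, buffer.Length)) > 0)
                    received.Write(buffer, 0, n);
                return Encoding.UTF8.GetString(received.ToArray());
            }
        });

        // "Client" side: connect, send the bytes, then close the connection.
        using (var client = new TcpClient())
        {
            client.Connect(IPAddress.Loopback, port);
            byte[] payload = Encoding.UTF8.GetBytes(message);
            client.GetStream().Write(payload, 0, payload.Length);
        }

        string result = serverTask.Result;
        listener.Stop();
        return result;
    }
}
```

In the real setup the two halves live in different programs: the listening side in Speech.exe and the connecting side in the Unity script.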

So we turn Speech into a socket server. The code is as follows:

/* Speech server */

using System;
using System.Linq;
using System.Speech.Synthesis;
using System.Text;
using Speech.Properties;

namespace Speech
{
    class Program
    {
        static void Main(string[] args)
        {
            var server = new NetServer();
            server.StartServer();

            while (true)
            {
                var res = Console.ReadLine();
                if (res == "exit")
                    break;
            }
        }
    }

    public class NetServer : SocketExtra.INetComponent
    {
        private readonly Speecher m_speecher;

        private readonly SocketExtra m_socket;

        public NetServer()
        {
            m_speecher = new Speecher();
            m_socket = new SocketExtra(this);
        }

        public void StartServer()
        {
            m_socket.Bind("127.0.0.1", Settings.Default.Port);
        }

        public bool NetSendMsg(byte[] sendbuffer)
        {
            return true;
        }

        public bool NetReciveMsg(byte[] recivebuffer)
        {
            var str = Encoding.Default.GetString(recivebuffer);
            Console.WriteLine(str);
            m_speecher.Speak(str);
            return true;
        }

        public bool Connected { get { return m_socket.Connected; } }
    }

    public class Speecher
    {
        private readonly SpeechSynthesizer m_speaker;

        public Speecher()
        {
            m_speaker = new SpeechSynthesizer();
            var installs = m_speaker.GetInstalledVoices();

            m_speaker.Volume = Settings.Default.SpeakVolume;
            m_speaker.Rate = Settings.Default.SpeakRate;
            var voice = Settings.Default.SpeakVoice;

            var selected = false;
            if (!string.IsNullOrEmpty(voice))
            {
                if (installs.Any(install => install.VoiceInfo.Name == voice))
                {
                    m_speaker.SelectVoice(voice);
                    selected = true;
                }
            }
            if (!selected)
            {
                foreach (var install in installs.Where(install => install.VoiceInfo.Culture.Name == "en-US"))
                {
                    m_speaker.SelectVoice(install.VoiceInfo.Name);
                    break;
                }
            }
        }

        public void Speak(string msg)
        {
            m_speaker.Speak(msg);
        }
    }
}
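
In the server above, Speecher.Speak calls SpeechSynthesizer.Speak, which is synchronous, so NetReciveMsg does not return until the utterance has finished. Whether that holds up the wrapper's receive loop depends on SocketExtra, which is not shown here. If it does, one option is to hand incoming text to a worker thread. The sketch below shows that pattern with a BlockingCollection; SpeechQueue is a hypothetical class, and the speak delegate stands in for a call such as m_speaker.Speak.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper (not in the original server): speaks queued messages
// one at a time on a background thread, in arrival order.
public sealed class SpeechQueue : IDisposable
{
    private readonly BlockingCollection<string> m_pending =
        new BlockingCollection<string>();
    private readonly Thread m_worker;

    // 'speak' is whatever actually produces audio, e.g. m_speaker.Speak.
    public SpeechQueue(Action<string> speak)
    {
        m_worker = new Thread(() =>
        {
            // Blocks while the queue is empty; ends after CompleteAdding.
            foreach (string text in m_pending.GetConsumingEnumerable())
                speak(text);
        });
        m_worker.IsBackground = true;
        m_worker.Start();
    }

    // Returns immediately; the worker thread does the speaking.
    public void Enqueue(string text)
    {
        m_pending.Add(text);
    }

    // Finishes the messages already queued, then stops the worker.
    public void Dispose()
    {
        m_pending.CompleteAdding();
        m_worker.Join();
        m_pending.Dispose();
    }
}
```

NetReciveMsg would then call Enqueue(str) on such a queue instead of calling m_speecher.Speak(str) directly.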

We also modify the Unity code to add the socket-related parts:

/* Unity client code */

using System.Collections;
using System.Diagnostics;
using System.Text;
using UnityEngine;

public class Speecher : MonoBehaviour, SocketExtra.INetComponent
{
    private SocketExtra m_socket;
    private Process m_process;

    protected void Awake()
    {
        Ins = this;
        m_process = new Process
        {
            StartInfo = new ProcessStartInfo
            {
                FileName = "speech.exe",
                CreateNoWindow = true,
                WindowStyle = ProcessWindowStyle.Hidden
            },
        };
        m_process.Start();
    }

    /*** Test code, can be removed: Start ***/
    protected IEnumerator Start()
    {
        yield return StartCoroutine(Connect());
        Speak("test");
    }
    /*** Test code, can be removed: End ***/

    public IEnumerator Connect()
    {
        m_socket = new SocketExtra(this);
        // The port must match Settings.Default.Port used by the Speech server.
        m_socket.Connect("127.0.0.1", 9903);
        while (!m_socket.Connected)
        {
            yield return 1;
        }
    }

    protected void OnDestroy()
    {
        if (m_process != null && !m_process.HasExited)
            m_process.Kill();
        m_process = null;
    }

    public static Speecher Ins;

    public static void Speak(string str)
    {
#if UNITY_EDITOR||UNITY_STANDALONE_WIN
        Ins.Speech(str);
#endif
    }

    public void Speech(string str)
    {
        if (m_socket.Connected)
        {
            var bytes = Encoding.Default.GetBytes(str);
            m_socket.SendMsg(bytes);
        }
    }

    public bool NetReciveMsg(byte[] recivebuffer)
    {
        return true;
    }

    public bool NetSendMsg(byte[] sendbuffer)
    {
        return true;
    }
}
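
Both sides above convert text with Encoding.Default and send it as raw bytes. Two details are worth keeping in mind with this approach. Encoding.Default can differ between runtimes, and TCP is a byte stream, so two messages sent in quick succession may arrive joined together or split. Whether SocketExtra already handles message boundaries is not shown in this article. The following is a sketch of one common remedy, assuming nothing about SocketExtra: encode as UTF-8 and prefix each message with a 4-byte little-endian length. MessageFraming and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not part of SocketExtra): length-prefixed UTF-8 messages.
public static class MessageFraming
{
    // Prefix the UTF-8 payload with its length as 4 little-endian bytes.
    public static byte[] Frame(string text)
    {
        byte[] payload = Encoding.UTF8.GetBytes(text);
        var framed = new byte[4 + payload.Length];
        framed[0] = (byte)payload.Length;
        framed[1] = (byte)(payload.Length >> 8);
        framed[2] = (byte)(payload.Length >> 16);
        framed[3] = (byte)(payload.Length >> 24);
        Buffer.BlockCopy(payload, 0, framed, 4, payload.Length);
        return framed;
    }

    // Removes every complete message from 'buffer' and returns them in order.
    // Bytes of an incomplete message stay in 'buffer' for the next call.
    public static List<string> Unframe(List<byte> buffer)
    {
        var messages = new List<string>();
        while (buffer.Count >= 4)
        {
            int length = buffer[0] | (buffer[1] << 8)
                       | (buffer[2] << 16) | (buffer[3] << 24);
            if (buffer.Count < 4 + length)
                break; // the rest of this message has not arrived yet
            byte[] payload = buffer.GetRange(4, length).ToArray();
            messages.Add(Encoding.UTF8.GetString(payload));
            buffer.RemoveRange(0, 4 + length);
        }
        return messages;
    }
}
```

The sender would pass MessageFraming.Frame(str) to SendMsg, and the receiver would append each received chunk to a List<byte> and speak whatever Unframe returns.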