Unity 2017 Game Optimizations (by Chris Dickinson)

1. Pursuing Performance Problems

  Pursuing Performance Problems, provides an exploration of the Unity Profiler and a series of methods to profile our application, detect performance bottlenecks, and perform root cause analysis.

2. Scripting Strategies (read)

  Scripting Strategies, deals with the best practices for our Unity C# Script code, minimizing MonoBehaviour callback overhead, improving interobject communication, and more.

3. The Benefits of Batching

  The Benefits of Batching, explores Unity's Dynamic Batching and Static Batching systems, and how they can be utilized to ease the burden on the Rendering Pipeline.

4. Kickstart Your Art

  Kickstart Your Art, helps you understand the underlying technology behind art assets and learn how to avoid common pitfalls with importing, compression, and encoding.

5. Faster Physics

  Faster Physics, is about investigating the nuances of Unity's internal Physics Engines for both 3D and 2D games, and how to properly organize our physics objects for improved performance.

6. Dynamic Graphics

  Dynamic Graphics, provides an in-depth exploration of the Rendering Pipeline and how to improve applications that suffer rendering bottlenecks on the GPU or the CPU, how to optimize graphical effects such as lighting, shadows, and Particle Effects, ways in which to optimize Shader code, and some specific techniques for mobile devices.

7. Virtual Velocity and Augmented Acceleration

  Virtual Velocity and Augmented Acceleration, focuses on the new entertainment mediums of Virtual Reality (VR) and Augmented Reality (AR), and includes several techniques for optimizing performance that are unique to apps built for these platforms. 

8. Masterful Memory Management

  Masterful Memory Management, examines the inner workings of the Unity Engine, the Mono Framework, and how memory is managed within these components to protect our application from excessive heap allocations and runtime garbage collection.

9. Tactical Tips and Tricks

  Tactical Tips and Tricks, closes the book with a multitude of useful techniques used by Unity professionals to improve project workflow and scene management. 

1. Pursuing Performance Problems

  The Unity Profiler

 

  Launching the Profiler


  Editor or standalone instances


  Connecting to a WebGL instance


  Remote connection to an iOS device


  Remote connection to an Android device


  Editor profiling


  The Profiler window


  Profiler controls


  Add Profiler


  Record


  Deep Profile


  Profile Editor


  Connected Player


  Clear


  Load


  Save


  Frame Selection


  Timeline View


  Breakdown View Controls


  Breakdown View


  The CPU Usage Area


  The GPU Usage Area


  The Rendering Area


  The Memory Area


  The Audio Area


  The Physics 3D and Physics 2D Areas


  The Network Messages and Network Operations Areas


  The Video Area


  The UI and UI Details Areas


  The Global Illumination Area


  Best approaches to performance analysis


  Verifying script presence


  Verifying script count


  Verifying the order of events


  Minimizing ongoing code changes


  Minimizing internal distractions


  Minimizing external distractions


  Targeted profiling of code segments


  Profiler script control


  Custom CPU Profiling


  Final thoughts on Profiling and Analysis


  Understanding the Profiler


  Reducing noise


  Focusing on the issue


  Summary

 

2. Scripting Strategies

In this chapter, we will explore ways of applying performance enhancements to the following areas:

  • Accessing Components
  • Component callbacks (Update(), Awake(), and so on)
  • Coroutines
  • GameObject and Transform usage
  • Interobject communication
  • Mathematical calculations
  • Deserialization such as Scene and Prefab loading

  Obtain Components using the fastest method
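The three GetComponent() overloads this heading refers to can be sketched as follows (a minimal illustration; the performance ranking is the section's point, not measured numbers):

private Rigidbody rbGeneric;
private Rigidbody rbTyped;
private Rigidbody rbString;

void Awake() {
    // Three ways to fetch the same Component. The generic version is the
    // one to prefer; the string-based version is by far the slowest
    // because it must resolve the type by name.
    rbGeneric = GetComponent<Rigidbody>();                  // fastest
    rbTyped   = (Rigidbody)GetComponent(typeof(Rigidbody)); // slower
    rbString  = (Rigidbody)GetComponent("Rigidbody");       // slowest - avoid
}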

  Remove empty callback definitions

  Cache Component references

private Rigidbody _rigidbody;

void Awake() {
    // Cache the Component reference once, instead of calling
    // GetComponent<Rigidbody>() every frame
    _rigidbody = GetComponent<Rigidbody>();
}

void Update() {
    // use the cached _rigidbody reference here, for example:
    // _rigidbody.AddForce(Vector3.up);
}

  Share calculation output

  Update, Coroutines, and InvokeRepeating

void Update() {
    ProcessAI();
}

private float _aiProcessDelay = 0.2f;
private float _timer = 0.0f;

void Update() {
    _timer += Time.deltaTime;
    if (_timer > _aiProcessDelay) {
        ProcessAI();
        _timer -= _aiProcessDelay;
    }
}

void Start() {
    StartCoroutine(ProcessAICoroutine());
}

IEnumerator ProcessAICoroutine() {
    while (true) {
        ProcessAI();
        yield return new WaitForSeconds(_aiProcessDelay);
    }
}

void Start() {
    InvokeRepeating("ProcessAI", 0f, _aiProcessDelay);
}

  Faster GameObject null reference checks

        if (!System.Object.ReferenceEquals(gameObject, null)) {
            // do something
        }

  Avoid retrieving string properties from GameObjects
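A short sketch of the idea: properties such as tag and name return a new managed string copied across the Native-Managed Bridge every time they are read, whereas CompareTag() performs the check natively without the allocation:

void OnTriggerEnter(Collider other) {
    // Allocates a new string each call just to compare it:
    if (other.gameObject.tag == "Player") { /* ... */ }

    // No allocation; the comparison happens on the Native side:
    if (other.gameObject.CompareTag("Player")) { /* ... */ }
}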

  Use appropriate data structures

  Avoid re-parenting Transforms at runtime

GameObject.Instantiate(Object original, Transform parent);

transform.hierarchyCapacity;

  Consider caching Transform changes
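A minimal sketch of the idea (the movement helpers are hypothetical): accumulate all changes in a plain Vector3 during the frame and write to the Transform once, since every write to transform.position can trigger internal change notifications:

private Vector3 _cachedPosition;

void Awake() {
    _cachedPosition = transform.position;
}

void Update() {
    // accumulate changes locally (hypothetical helpers)...
    _cachedPosition += GetMovementThisFrame();
    _cachedPosition += GetKnockbackThisFrame();

    // ...then commit with a single Transform write per frame
    transform.position = _cachedPosition;
}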

  Avoid Find() and SendMessage() at runtime

    Assigning references to pre-existing objects

    Static Classes

using UnityEngine;
public class EnemyCreatorComponent : MonoBehaviour {
    [SerializeField] private int _numEnemies;
    [SerializeField] private GameObject _enemyPrefab;
    [SerializeField] private EnemyManagerComponent _enemyManager;
    void Start() {
        for (int i = 0; i < _numEnemies; ++i) {
            CreateEnemy();
        }
    }
    public void CreateEnemy() {
        _enemyManager.CreateEnemy(_enemyPrefab);
    }
}

    Singleton Components

using UnityEngine;
public class SingletonComponent<T> : MonoBehaviour where T : SingletonComponent<T> {
    private static T __Instance;
    protected static SingletonComponent<T> _Instance {
        get {
            if (!__Instance) {
                T[] managers = GameObject.FindObjectsOfType(typeof(T)) as T[];
                if (managers != null) {
                    if (managers.Length == 1) {
                        __Instance = managers[0];
                        return __Instance;
                    } else if (managers.Length > 1) {
                        Debug.LogError("You have more than one " +
                        typeof(T).Name +
                        " in the Scene. You only need " +
                        "one - it's a singleton!");
                        for (int i = 0; i < managers.Length; ++i) {
                            T manager = managers[i];
                            Destroy(manager.gameObject);
                        }
                    }
                }

                GameObject go = new GameObject(typeof(T).Name, typeof(T));
                __Instance = go.GetComponent<T>();
                DontDestroyOnLoad(__Instance.gameObject);
            }
            return __Instance;
        }
        set {
            __Instance = value as T;
        }
    }
}

public class EnemyManagerSingletonComponent : SingletonComponent<EnemyManagerSingletonComponent> {
    public static EnemyManagerSingletonComponent Instance {
        get { return ((EnemyManagerSingletonComponent)_Instance); }
        set { _Instance = value; }
    }

    public void CreateEnemy(GameObject prefab) {
        // same as StaticEnemyManager
    }

    public void KillAll() {
        // same as StaticEnemyManager
    }
}

    A global Messaging System

public class Message {
    public string type;
    public Message() { type = this.GetType().Name; }
}

Moving on to our MessagingSystem class, we should define its features by the kind of requirements we need it to fulfill:

  • It should be globally accessible
  • Any object (MonoBehaviour or not) should be able to register/deregister as a listener to receive specific message types (that is, the Observer design pattern)
  • Registering objects should provide a method to call when the given message is broadcasted from elsewhere
  • The system should send the message to all listeners within a reasonable time frame, but not choke on too many requests at once

      A globally accessible object

      Registration

public delegate bool MessageHandlerDelegate(Message message);

      Message processing

      Implementing the Messaging System

using System.Collections.Generic;
using UnityEngine;

public class MessagingSystem : SingletonComponent<MessagingSystem> {

    public static MessagingSystem Instance {
        get { return ((MessagingSystem)_Instance); }
        set { _Instance = value; }
    }

    private Dictionary<string, List<MessageHandlerDelegate>> _listenerDict = new Dictionary<string, List<MessageHandlerDelegate>>();

    public bool AttachListener(System.Type type, MessageHandlerDelegate handler) {
        if (type == null) {
            Debug.Log("MessagingSystem: AttachListener failed due to having no " +
            "message type specified");
            return false;
        }

        string msgType = type.Name;

        if (!_listenerDict.ContainsKey(msgType)) {
            _listenerDict.Add(msgType, new List<MessageHandlerDelegate>());
        }

        List<MessageHandlerDelegate> listenerList = _listenerDict[msgType];
        if (listenerList.Contains(handler)) {
            return false; // listener already in list
        }

        listenerList.Add(handler);

        return true;
    }
}

      Message queuing and processing

private Queue<Message> _messageQueue = new Queue<Message>();

public bool QueueMessage(Message msg) {
    if (!_listenerDict.ContainsKey(msg.type)) {
        return false;
    }
    _messageQueue.Enqueue(msg);
    return true;
}

private const float _maxQueueProcessingTime = 16.667f; // per-frame budget in milliseconds (~1 frame at 60 FPS)
private System.Diagnostics.Stopwatch timer = new System.Diagnostics.Stopwatch();

void Update() {
    timer.Reset(); // restart the budget each frame; otherwise Elapsed accumulates forever
    timer.Start();
    while (_messageQueue.Count > 0) {
        if (_maxQueueProcessingTime > 0.0f) {
            if (timer.Elapsed.TotalMilliseconds > _maxQueueProcessingTime) {
                timer.Stop();
                return;
            }
        }

        Message msg = _messageQueue.Dequeue();
        if (!TriggerMessage(msg)) {
            Debug.Log("Error when processing message: " + msg.type);
        }
    }
    timer.Stop();
}

public bool TriggerMessage(Message msg) {
    string msgType = msg.type;

    if (!_listenerDict.ContainsKey(msgType)) {
        Debug.Log("MessagingSystem: Message \"" + msgType + "\" has no listeners!");
        return false;
    }

    List<MessageHandlerDelegate> listenerList = _listenerDict[msgType];
    for (int i = 0; i < listenerList.Count; ++i) {
        if (listenerList[i](msg)) {
            return true; // a delegate returning true consumes the message, stopping further processing
        }
    }

    return true;
}

      Implementing custom messages

public class CreateEnemyMessage : Message { }
public class EnemyCreatedMessage : Message {
    public readonly GameObject enemyObject;
    public readonly string enemyName;
    public EnemyCreatedMessage(GameObject enemyObject, string enemyName) {
        this.enemyObject = enemyObject;
        this.enemyName = enemyName;
    }
}

      Message sending

public class EnemyCreatorComponent : MonoBehaviour {
    void Update() {
        if (Input.GetKeyDown(KeyCode.Space)) {
            MessagingSystem.Instance.QueueMessage(new CreateEnemyMessage());
        }
    }
}

      Message registration

public class EnemyManagerWithMessagesComponent : MonoBehaviour {

    private List<GameObject> _enemies = new List<GameObject>();

    [SerializeField] private GameObject _enemyPrefab;

    void Start() {
        MessagingSystem.Instance.AttachListener(typeof(CreateEnemyMessage), this.HandleCreateEnemy);
    }

    bool HandleCreateEnemy(Message msg) {
        CreateEnemyMessage castMsg = msg as CreateEnemyMessage;
        string[] names = { "Tom", "Dick", "Harry" };
        GameObject enemy = GameObject.Instantiate(_enemyPrefab, 5.0f * Random.insideUnitSphere, Quaternion.identity);
        string enemyName = names[Random.Range(0, names.Length)];
        enemy.gameObject.name = enemyName;
        _enemies.Add(enemy);
        MessagingSystem.Instance.QueueMessage(new EnemyCreatedMessage(enemy, enemyName));
        return true;
    }
}

public class EnemyCreatedListenerComponent : MonoBehaviour {

    void Start() {
        MessagingSystem.Instance.AttachListener(typeof(EnemyCreatedMessage), this.HandleEnemyCreated);
    }

    bool HandleEnemyCreated(Message msg) {
        EnemyCreatedMessage castMsg = msg as EnemyCreatedMessage;
        Debug.Log(string.Format("A new enemy was created! {0}", castMsg.enemyName));
        return true;
    }
}

      Message cleanup

public bool DetachListener(System.Type type, MessageHandlerDelegate handler) {
    if (type == null) {
        Debug.Log("MessagingSystem: DetachListener failed due to having no message type specified");
        return false;
    }

    string msgType = type.Name;

    if (!_listenerDict.ContainsKey(msgType)) {
        return false;
    }

    List<MessageHandlerDelegate> listenerList = _listenerDict[msgType];
    if (!listenerList.Contains(handler)) {
        return false;
    }

    listenerList.Remove(handler);
    return true;
}

void OnDestroy() {
    if (MessagingSystem.IsAlive) {
        // detach the same type/handler pair that was attached in Start()
        MessagingSystem.Instance.DetachListener(typeof(CreateEnemyMessage), this.HandleCreateEnemy);
    }
}

      Wrapping up the Messaging System

   Disable unused scripts and objects

    Disabling objects by visibility

// Disable only this Component when its Renderer goes offscreen
void OnBecameVisible() { enabled = true; }
void OnBecameInvisible() { enabled = false; }

// Alternatively, deactivate the whole GameObject. Note that these callbacks
// must then come from a Renderer on a separate, still-active object, since a
// deactivated GameObject no longer receives OnBecameVisible()
void OnBecameVisible() { gameObject.SetActive(true); }
void OnBecameInvisible() { gameObject.SetActive(false); }

    Disabling objects by distance

[SerializeField] GameObject _target;
[SerializeField] float _maxDistance;
[SerializeField] int _coroutineFrameDelay;

void Start() {
    StartCoroutine(DisableAtADistance());
}

IEnumerator DisableAtADistance() {
    while (true) {
        float distSqrd = (transform.position - _target.transform.position).sqrMagnitude;

        if (distSqrd < _maxDistance * _maxDistance) {
            enabled = true;
        } else {
            enabled = false;
        }

        for(int i = 0; i < _coroutineFrameDelay; ++i) {
            yield return new WaitForEndOfFrame();
        }
    }
}

  Consider using distance-squared over distance
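The technique in a nutshell: compare squared magnitudes against a squared threshold, avoiding the square root hidden inside Vector3.Distance() and magnitude:

// Equivalent range checks; the second avoids a square root.
bool inRangeSlow = Vector3.Distance(transform.position, target.position) < range;

float sqrDist = (transform.position - target.position).sqrMagnitude;
bool inRangeFast = sqrDist < range * range;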

  Minimize Deserialization behavior

Unity's Serialization system is mainly used for Scenes, Prefabs, ScriptableObjects, and various Asset types (which tend to derive from ScriptableObject).

When one of these object types is saved to disk, it is converted into a text file using the Yet Another Markup Language (YAML) format, which can be deserialized back into the original object type at a later time.

All GameObjects and their properties get serialized when a Prefab or Scene is serialized, including private and protected fields, all of their Components, as well as their child GameObjects and those children's Components, and so on.

When our application is built, this serialized data is bundled together in large binary data files internally called Serialized Files in Unity.

Reading and deserializing this data from disk at runtime is an incredibly slow process (relatively speaking) and so all deserialization activity comes with a significant performance cost. 

This kind of deserialization takes place any time we call Resources.Load() for a file path found under a folder named Resources.
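For example (the asset path here is hypothetical), a Prefab saved under any folder named Resources can be loaded, and hence deserialized, like so:

// First call deserializes Assets/Resources/Enemies/Orc.prefab from disk;
// later calls for the same path return the cached object much faster.
GameObject orcPrefab = Resources.Load<GameObject>("Enemies/Orc");
if (orcPrefab != null) {
    GameObject orc = Object.Instantiate(orcPrefab);
}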

Once the data has been loaded from disk into memory, then reloading the same reference later is much faster, but disk activity is always required the first time it is accessed. 

Naturally, the larger the data set we need to deserialize, the longer this process takes.

Since every Component of a Prefab gets serialized, then the deeper the hierarchy is, the more data needs to be deserialized.

This can be a problem for Prefabs with very deep hierarchies, Prefabs with many empty GameObjects (since every GameObject always contains at least a Transform Component), and particularly problematic for User Interface(UI) Prefabs, since they tend to house many more Components than a typical Prefab. 

Loading large serialized data sets like these can cause a significant CPU spike the first time they are loaded, which tends to increase loading time if they are needed immediately at the start of the Scene.

More importantly, they can cause frame drops if they are loaded at runtime.

There are a couple of approaches we can use to minimize the costs of deserialization.

    Reduce serialized object size

    Load serialized objects asynchronously
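A sketch of this approach using Resources.LoadAsync() inside a Coroutine (the asset path is hypothetical); the load is processed in the background instead of blocking a single frame:

IEnumerator LoadEnemyPrefabAsync() {
    ResourceRequest request = Resources.LoadAsync<GameObject>("Enemies/Orc");
    yield return request; // resumes once loading has finished

    GameObject prefab = (GameObject)request.asset;
    // prefab is now ready to be instantiated without a deserialization spike
}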

    Keep previously loaded serialized objects in memory

    Move common data into ScriptableObjects

  Load scenes additively and asynchronously
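A sketch of additive asynchronous loading with the SceneManager API (the Scene name is hypothetical and must be listed in Build Settings):

using UnityEngine.SceneManagement;

IEnumerator LoadDistrictAsync() {
    AsyncOperation op = SceneManager.LoadSceneAsync("CityDistrict", LoadSceneMode.Additive);
    while (!op.isDone) {
        yield return null; // keep rendering frames while the Scene streams in
    }
}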

  Create a custom Update() layer

Earlier in this chapter, in the "Update, Coroutines and InvokeRepeating" section, we discussed the relative pros and cons of using these Unity Engine features as a means of avoiding excessive CPU workload during most of our frames.

Regardless of which of these approaches we might adopt, there is an additional risk when lots of MonoBehaviours are written to periodically call some function: too many methods may trigger in the same frame simultaneously.

Imagine thousands of MonoBehaviours that initialized together at the start of a Scene, each starting a Coroutine at the same time that will process their AI tasks every 500 milliseconds.

It is highly likely that they would all trigger within the same frame, causing a huge spike in CPU usage for a moment, which settles down temporarily and then spikes again a few moments later when the next round of AI processing is due.

Ideally, we would want to spread these invocations out over time. 

The following are possible solutions to this problem:

  • Generate a random time to wait each time the timer expires or the Coroutine triggers
  • Spread out Coroutine initialization so that only a handful of them are started in each frame
  • Pass the responsibility of calling updates to some God Class that places a limit on the number of invocations that occur in each frame

The first two options are appealing since they’re relatively simple and we know that Coroutines can potentially save us a lot of unnecessary overhead. 

However, as we discussed, there are many dangers and unexpected side effects associated with such drastic design changes. 

A potentially better approach to optimize updates is to not use Update() at all, or more accurately, to use it only once.

When Unity calls Update(), and in fact, any of its callbacks, it crosses the aforementioned Native-Managed Bridge, which can be a costly task.

In other words, the processing cost of executing 1,000 separate Update() callbacks will be more expensive than executing one Update() callback, which calls into 1,000 regular functions.

As we witnessed in the "Remove empty callback definitions" section, calling Update() thousands of times is not a trivial amount of work for the CPU to undertake, primarily because of the Bridge.

We can, therefore, minimize how often Unity needs to cross the Bridge by having a God Class MonoBehaviour use its own Update() callback to call a custom update-style system used by our custom Components. 

In fact, many Unity developers prefer implementing this design right from the start of their projects, as it gives them finer control over when and how updates propagate throughout the system; this can be used for things such as menu pausing, cool time manipulation effects, or prioritizing important tasks and/or suspending low priority tasks if we detect that we’re about to reach our CPU budget for the current frame. 

All objects wanting to integrate with such a system must have a common entry point.

We can achieve this through an Interface Class with the interface keyword.

Interface Classes essentially set up a contract whereby any class that implements the Interface Class must provide a specific series of methods.

In other words, if we know the object implements an Interface Class, then we can be certain about what methods are available.

In C#, classes can only derive from a single base class, but they can implement any number of Interface Classes (this avoids the deadly diamond of death problem that C++ programmers will be familiar with). 

The following Interface Class definition will suffice, which only requires the implementing class to define a single method called OnUpdate():


public interface IUpdateable {
    void OnUpdate(float dt);
}


It’s common practice to start an Interface Class definition with a capital ‘I’ to make it clear that it is an Interface Class we’re dealing with.

The beauty of Interface Classes is that they improve the decoupling of our codebase, allowing huge subsystems to be replaced; as long as the Interface Class is adhered to, we will have greater confidence that it will continue to function as intended. 

Next, we'll define a custom MonoBehaviour type which implements this Interface Class:


public class UpdateableComponent : MonoBehaviour, IUpdateable {
    public virtual void OnUpdate(float dt) {}
}


Note that we're naming the method OnUpdate() rather than Update().

We're defining a custom version of the same concept, but we want to avoid name collisions with the built-in Update() callback. 

The OnUpdate() method of the UpdateableComponent class receives the current delta time (dt) as an argument, which spares us from a bunch of unnecessary Time.deltaTime calls, which are commonly used in Update() callbacks.

We've also made the function virtual to allow derived classes to customize it. 

This function will never be called as it's currently written.

Unity automatically grabs and invokes methods defined with the Update() name, but has no concept of our OnUpdate() function, so we will need to implement something that will call this method when the time is appropriate.

For example, some kind of GameLogic God Class could be used for this purpose. 

During the initialization of this Component, we should do something to notify our GameLogic object of both its existence and its destruction so that it knows when to start and stop calling its OnUpdate() function. 

In the following example, we will assume that our GameLogic class is a SingletonComponent, as defined earlier in the "Singleton Components" section, and has appropriate static functions defined for registration and deregistration.

Bear in mind that it could just as easily use the aforementioned MessagingSystem to notify the GameLogic of its creation/destruction.

For MonoBehaviours to hook into this system, the most appropriate place is within their Start() and OnDestroy() callbacks:


void Start() {
    GameLogic.Instance.RegisterUpdateableObject(this);
}

void OnDestroy() {
    if (GameLogic.IsAlive) {
        GameLogic.Instance.DeregisterUpdateableObject(this);
    }
}


It is best to use the Start() method for the task of registration, since using Start() means that we can be certain all other pre-existing Components will have at least had their Awake() methods called prior to this moment.

This way, any critical initialization work will have already been done on the object before we start invoking updates on it. 

Note that because we're using Start() in a MonoBehaviour base class, if we define a Start() method in a derived class, it will effectively override the base class definition, and Unity will grab the derived Start() method as a callback instead.

It would, therefore, be wise to implement a virtual Initialize() method so that derived classes can override it to customize initialization behavior without interfering with the base class's task of notifying the GameLogic object of our Component's existence. 

The following code provides an example of how we might implement a virtual Initialize() method. 

 

void Start() {
    GameLogic.Instance.RegisterUpdateableObject(this);
    Initialize();
}

protected virtual void Initialize() {
    // derived classes should override this method for initialization code, and NOT reimplement Start()
}


Finally, we will need to implement the GameLogic class.

The implementation is effectively the same whether it is a SingletonComponent or a MonoBehaviour, and whether or not it uses the MessagingSystem.

Either way, our UpdateableComponent class must register and deregister as IUpdateable objects, and the GameLogic class must use its own Update() callback to iterate through every registered object and call their OnUpdate() function.

Here is the definition for our GameLogic class:

 

using System.Collections.Generic;
using UnityEngine;

public class GameLogicSingletonComponent : SingletonComponent<GameLogicSingletonComponent> {
    public static GameLogicSingletonComponent Instance {
        get { return ((GameLogicSingletonComponent)_Instance); }
        set { _Instance = value; }
    }

    List<IUpdateable> _updateableObjects = new List<IUpdateable>();


    public void RegisterUpdateableObject(IUpdateable obj) {
        if (!_updateableObjects.Contains(obj)) {
            _updateableObjects.Add(obj);
        }
    }

    public void DeregisterUpdateableObject(IUpdateable obj) {
        if (_updateableObjects.Contains(obj)) {
            _updateableObjects.Remove(obj);
        }
    }

    void Update() {
        float dt = Time.deltaTime;
        for (int i = 0; i < _updateableObjects.Count; ++i) {
            _updateableObjects[i].OnUpdate(dt);
        }
    }
}


If we make sure that all of our custom Components inherit from the UpdateableComponent class, then we've effectively replaced "N" invocations of the Update() callback with just one Update() callback, plus "N" virtual function calls.

This can save us a large amount of performance overhead because even though we're calling virtual functions (which cost slightly more than non-virtual function calls, since the call must be redirected to the correct implementation), we're still keeping the overwhelming majority of update behavior inside our Managed code and avoiding the Native-Managed Bridge as much as possible.

This class can even be expanded to provide priority systems, to skip low-priority tasks if it detects that the current frame has taken too long, and many other possibilities.

Depending on how deep you already are into your current project, such changes can be incredibly daunting, time-consuming, and likely to introduce a lot of bugs as subsystems are updated to make use of a completely different set of dependencies.

However, the benefits can outweigh the risks if time is on your side.

It would be wise to do some testing on a group of objects in a Scene designed similarly to your current Scene files to verify that the benefits outweigh the costs.

  Summary 

3. The Benefits of Batching

In 3D graphics and games, batching is a very general term describing the process of grouping a large number of wayward pieces of data together and processing them as a single, large block of data.

This situation is ideal for CPUs, and particularly GPUs, which can handle simultaneous processing of multiple tasks with their multiple cores.

Having a single core switching back and forth between different locations in memory takes time, so the less this needs to be done, the better.

In some cases, the act of batching refers to large sets of meshes, vertices, edges, UV coordinates, and other different data types that are used to represent a 3D object.

However, the term could just as easily refer to the act of batching audio files, sprites, texture files, and other large datasets.

So, just to clear up any confusion, when the topic of batching is mentioned in Unity, it is usually referring to the two primary mechanisms it offers for batching mesh data: Dynamic Batching and Static Batching.

These methods are essentially two different forms of geometry merging, where we combine mesh data of multiple objects together and render them all in a single instruction, as opposed to preparing and drawing each one separately. 

The process of batching together multiple meshes into a single mesh is possible because there is no reason a mesh object must fill a contiguous volume of 3D space.

The Rendering Pipeline is perfectly happy with accepting a collection of vertices that are not attached together with edges, and so we can take multiple separate meshes that might have resulted in multiple render instructions and combine them together into a single mesh, thus rendering it out using a single instruction.
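The same merging idea can be performed manually with Mesh.CombineMeshes(); this sketch (assuming all child meshes share a single Material) collapses every child mesh into one mesh that can be drawn with a single instruction:

MeshFilter[] filters = GetComponentsInChildren<MeshFilter>();
CombineInstance[] combine = new CombineInstance[filters.Length];
for (int i = 0; i < filters.Length; ++i) {
    combine[i].mesh = filters[i].sharedMesh;
    combine[i].transform = filters[i].transform.localToWorldMatrix;
}

Mesh merged = new Mesh();
merged.CombineMeshes(combine); // assumes one shared Material across all meshes
GetComponent<MeshFilter>().sharedMesh = merged;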

There has been a lot of confusion over the years surrounding the conditions under which the Dynamic Batching and Static Batching systems activate and where we might even see a performance improvement.

After all, in some cases, batching can actually degrade performance if it is not used wisely.

A proper understanding of these systems will give us the knowledge we need to improve the graphics performance of our application in significant ways.

This chapter intends to dispel much of the misinformation floating around about these systems.

We will observe, via explanation, exploration, and examples, just how these two batching methods operate.

This will enable us to make informed decisions, making the best use of them to improve our application's performance.

We will cover the following topics in this chapter:

  • A brief introduction to the Rendering Pipeline and the concept of Draw Calls 
  • How Unity's Materials and Shaders work together to render our objects 
  • Using the Frame Debugger to visualize rendering behavior
  • How Dynamic Batching works, and how to optimize it
  • How Static Batching works, and how to optimize it

  Draw Calls

Before we discuss Dynamic Batching and Static Batching independently, let's first understand the problems that they are both trying to solve within the Rendering Pipeline.

We will try to keep fairly light on the technicalities as we will explore this topic in greater detail in Chapter 6, Dynamic Graphics.

The primary goal of these batching methods is to reduce the number of Draw Calls required to render all objects in the current view.

At its most basic form, a Draw Call is a request sent from the CPU to the GPU asking it to draw an object.

Draw Call is the common industry vernacular for this process, although they are sometimes referred to as SetPass Calls in Unity, since some low-level methods are named as such.

Think of it as configuring options before initiating the current rendering pass.

We will refer to them as Draw Calls throughout the remainder of this book.

Before a Draw Call can be requested, several tasks need to be completed.

Firstly, mesh and texture data must be pushed from the CPU memory (RAM) into GPU memory (VRAM), which typically takes place during initialization of the Scene, but only for textures and meshes the Scene file knows about.

If we dynamically instantiate objects at runtime using texture and mesh data that hasn't appeared in the Scene yet, then they must be loaded at the time they are instantiated.

The Scene cannot know ahead of time which Prefabs we're planning to instantiate at runtime, as many of them are hidden behind conditional statements and much of our application's behavior depends upon user input.

Next, the CPU must prepare the GPU by configuring the options and rendering features that are needed to process the object that is the target of the Draw Call.

 

These communication tasks between the CPU and GPU take place through the underlying Graphics API, which could be DirectX, OpenGL, OpenGLES, Metal, WebGL, or Vulkan, depending on the platform we're targeting and certain graphical settings.

These API calls go through a library, called a driver, which maintains a long series of complex and interrelated settings, state variables, and datasets that can be configured and executed from our application (although drivers are designed to service multiple applications simultaneously, as well as render calls coming from multiple threads).

The available features change enormously based on the graphics card we're using and the version of the Graphics API we're targeting; more advanced graphics cards support more advanced features, which would need to be supported by newer versions of the API, so updated drivers would be needed to enable them.

The sheer number of settings, supported features, and compatibility levels between one version and another that have been created over the years (particularly for the older APIs such as DirectX and OpenGL) can be nothing short of mind-boggling.

Thankfully, at a certain level of abstraction, all of these APIs tend to operate in a similar fashion; hence Unity is able to support many different Graphics APIs through a common interface.

This utterly massive array of settings that must be configured to prepare the Rendering Pipeline just prior to rendering an object is often condensed into a single term known as the Render State.

Until these Render State options are changed, the GPU will maintain the same Render State for all incoming objects and render them in a similar fashion. 

Changing the Render State can be a time-consuming process.

So, for example, if we were to set the Render State to use a blue texture file and then ask it to render one gigantic mesh, then it would be rendered very rapidly with the whole mesh appearing blue.

We could then render nine more completely different meshes, and they would all be rendered blue, since we haven't changed which texture is being used.

If, however, we wanted to render 10 meshes using 10 different textures, then this would take longer.

This is because we will need to prepare the Render State with the new texture just prior to sending the Draw Call instruction for each mesh.

The texture being used to render the current object is effectively a global variable in the Graphics API, and changing a global variable within a parallel system is much easier said than done.

In a massively parallel system such as a GPU, we must effectively wait until all of the current jobs have reached the same synchronization point (in other words, the fastest cores need to stop and wait for the slowest ones to catch up, wasting processing time that they could be using on other tasks) before we can make a Render State change, at which point we will need to spin up all of the parallel jobs again.

This can waste a lot of time, so the less we need to ask the Render State to change, the faster the Graphics API will be able to process our requests.

Things that can trigger Render State synchronization include--but are not limited to--pushing a new texture to the GPU, and changing the Shader, lighting information, shadows, transparency, or pretty much any other graphical setting we can think of.

Once the Render State is configured, the CPU must decide what mesh to draw, what textures and Shader it should use, and where to draw the object based on its position, rotation, and scale (all represented within a 4x4 matrix known as a transform, which is where the Transform Component gets its name from) and then send an instruction to the GPU to draw it.

In order to keep the communication between CPU and GPU very dynamic, new instructions are pushed into a queue known as the Command Buffer.

This queue contains instructions that the CPU has created and that the GPU pulls from each time it finishes the preceding command.

The trick to how batching improves the performance of this process is that a new Draw Call does not necessarily mean that a new Render State must be configured.

If two objects share the exact same Render State information, then the GPU can immediately begin rendering the new object since the same Render State is maintained after the last object is finished.

This eliminates the time wasted due to a Render State synchronization.

It also serves to reduce the number of instructions that need to be pushed into the Command Buffer, reducing the workload on both the CPU and GPU.
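To make the batching condition concrete, here is a minimal sketch (the class and field names are ours, not from the book) of the key practical rule: objects that share the exact same Material keep the same Render State, so their Draw Calls can be batched.

```csharp
using UnityEngine;

// Hypothetical example: assigning one shared Material to many renderers
// keeps their Render State identical, letting Unity batch their Draw Calls.
public class SharedMaterialExample : MonoBehaviour
{
    [SerializeField] private Material sharedMaterial; // one Material asset for all objects
    [SerializeField] private Renderer[] renderers;    // the renderers we want batched

    private void Start()
    {
        foreach (var r in renderers)
        {
            // Use sharedMaterial, not material; accessing .material creates a
            // per-renderer Material copy, which changes the Render State and
            // breaks batching for that object.
            r.sharedMaterial = sharedMaterial;
        }
    }
}
```

This only illustrates the Render State aspect; the Dynamic Batching and Static Batching sections below cover the other conditions an object must meet to be batched.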
  Materials and Shaders


  The Frame Debugger

  Dynamic Batching


    Vertex attributes


    Mesh scaling


    Dynamic Batching summary


  Static Batching


  The Static flag

 

  Memory requirements


  Material references


  Static Batching caveats


    Edit Mode debugging of Static Batching


    Instantiating static meshes at runtime


  Static Batching summary


  Summary

 

4. Kickstart Your Art

  Audio

 

    Importing audio files


    Loading audio files


    Encoding formats and quality levels


    Audio performance enhancements


      Minimize active Audio Source count


      Enable Force to Mono for 3D sounds


      Resample to lower frequencies


      Consider all compression formats


      Beware of streaming


      Apply Filter Effects through Mixer Groups to reduce duplication


      Use remote content streaming responsibly


      Consider Audio Module files for background music


  Texture files

The terms texture and sprite often get confused in game development, so it's worth making the distinction--a texture is simply an image file, a big list of color data telling the interpreting program what color each pixel of the image should be, whereas a sprite is the 2D equivalent of a mesh, often just a single quad (a pair of triangles combined to make a rectangular mesh) that renders flat against the current Camera.

There are also things called Sprite Sheets, which are large collections of individual images contained within a larger texture file, commonly used to contain the animations of a 2D character.

These files can be split apart by tools, such as Unity's Sprite Editor, to form individual textures for the character's animated frames.

Both meshes and sprites use textures to render an image onto their surfaces.

Texture image files are typically generated in tools such as Adobe Photoshop or Gimp and then imported into our project in much the same way as audio files.

At runtime, these files are loaded into memory, pushed to the GPU's VRAM, and rendered by a Shader over the target sprite or mesh during a given Draw Call.
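As a hedged sketch of that runtime flow (the asset path "Textures/HeroDiffuse" is a hypothetical example), a texture can be loaded into CPU memory from a Resources folder and handed to a Renderer; Unity uploads it to VRAM when a Draw Call first needs it:

```csharp
using UnityEngine;

// Hypothetical example: load a texture at runtime and apply it to this
// object's Material so a Shader can render it during a Draw Call.
public class RuntimeTextureExample : MonoBehaviour
{
    private void Start()
    {
        // Loads the image data into CPU memory (RAM).
        Texture2D tex = Resources.Load<Texture2D>("Textures/HeroDiffuse");
        if (tex != null)
        {
            // Assigning via .material creates a per-object Material instance,
            // which has the Draw Call batching implications from Chapter 3.
            GetComponent<Renderer>().material.mainTexture = tex;
        }
    }
}
```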

    Texture compression formats

 


    Texture performance enhancements


      Reduce texture file size


      Use Mip Maps wisely


      Manage resolution downscaling externally


      Adjust Anisotropic Filtering levels


      Consider Atlasing


      Adjust compression rates for non-square textures


      Sparse Textures


      Procedural Materials


      Asynchronous Texture Uploading


  Mesh and animation files


    Reduce polygon count


    Tweak Mesh Compression


    Use Read-Write Enabled appropriately


    Consider baked animations


    Combine meshes


  Asset Bundles and Resources


  Summary

 

5. Faster Physics

In this chapter, we will cover the following areas:

Understanding how Unity's Physics Engine works:

  • Timesteps and FixedUpdates
  • Collider types
  • Collisions
  • Raycasting
  • Rigidbody active states

Physics performance optimizations:

  • How to structure Scenes for optimal physics behavior
  • Using the most appropriate types of Collider
  • Optimizing the Collision Matrix
  • Improving physics consistency and avoiding error-prone behavior
  • Ragdolls and other Joint-based objects

  Physics Engine internals

    Physics and time

      Maximum Allowed Timestep

It is important to note that if a lot of time has passed since the last Fixed Update (for example, the game froze momentarily), then Fixed Updates will continue to be calculated within the same Fixed Update loop until the Physics Engine has caught up with the current time.

For example, if the previous frame took 100 ms to render (say, a sudden CPU spike caused the main thread to block for a long time), then the Physics Engine will need to be updated five times.

The FixedUpdate() method will, therefore, be called five times before Update() can be called again due to the default Fixed Update Timestep of 20 milliseconds.

Of course, if there is a lot of physics activity to process during these five Fixed Updates, such that it takes more than 20 milliseconds to resolve them all, then the Physics Engine will need to invoke a sixth update. 

Consequently, it's possible during moments of heavy physics activity that the Physics Engine takes more time to process a Fixed Update than the amount of time it is simulating.

For example, if it took 30 ms to process a Fixed Update simulating 20 ms of Gameplay, then it has fallen behind, requiring it to process more Timesteps to try and keep up, but this could cause it to fall behind even further, requiring it to process even more Timesteps, and so on. 

In these situations the Physics Engine is never able to escape the Fixed Update loop and allow another frame to render.

This problem is often known as the spiral of death.

However, to prevent the Physics Engine from locking up our game during these moments, there is a maximum amount of time that the Physics Engine is allowed to process each Fixed Update loop.

This threshold is called the Maximum Allowed Timestep, and if the current batch of Fixed Updates takes too long to process, then it will simply stop and forgo further processing until the next render update completes.

This design allows the Rendering Pipeline to at least render the current state and allow for user input and gameplay logic to make some decisions during rare moments where the Physics Engine has gone ballistic (pun intended). 

This setting can be accessed through Edit | Project Settings | Time | Maximum Allowed Timestep.
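The same values can also be read or changed from script. A minimal sketch (the values shown are Unity's defaults, not tuning recommendations):

```csharp
using UnityEngine;

// Hypothetical example: the Time settings discussed above, set from code
// instead of the Project Settings window.
public class TimestepSettingsExample : MonoBehaviour
{
    private void Awake()
    {
        // Fixed Update Timestep: how much gameplay time each physics
        // update simulates (default 20 ms).
        Time.fixedDeltaTime = 0.02f;

        // Maximum Allowed Timestep: the cap on how long a single batch of
        // Fixed Updates may run before rendering is allowed to proceed
        // (default one third of a second).
        Time.maximumDeltaTime = 0.3333333f;
    }
}
```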

      Physics updates and runtime changes

When the Physics Engine processes a given timestep, it must move any active Rigidbody objects (GameObjects with a Rigidbody Component), detect any new collisions, and invoke the collision callbacks on the corresponding objects. 

The Unity documentation makes an explicit note that changes to Rigidbody objects should be handled within FixedUpdate() and other physics callbacks for exactly this reason.

These methods are tightly coupled with the update frequency of the Physics Engine as opposed to other parts of the Game Loop, such as Update(). 

This means that callbacks such as FixedUpdate() and OnTriggerEnter() are safe places to make Rigidbody changes, whereas methods such as Update() and Coroutines yielding on WaitForSeconds or WaitForEndOfFrame are not.

Ignoring this advice could cause unexpected physics behavior, as multiple changes may be made to the same object before the Physics Engine is given a chance to catch and process all of them. 

It's particularly dangerous to apply forces or impulses to objects in Update() callbacks without taking into account the frequency of those calls.

For instance, applying a 10-Newton force during each Update() while the player holds down a key would produce a completely different resultant velocity on different devices than doing the same thing in FixedUpdate(), since we can't rely on the number of Update() calls being consistent.

However, doing so in a FixedUpdate() callback will be much more consistent.

Therefore, we must ensure that all physics-related behavior is handled in the appropriate callbacks or we will risk introducing some especially confusing gameplay bugs that are very hard to reproduce. 
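A minimal sketch of the correct pattern (the class name and thrust value are hypothetical): continuous forces belong in FixedUpdate(), which runs once per physics timestep, rather than in Update(), which runs once per rendered frame at a rate that varies between devices.

```csharp
using UnityEngine;

// Hypothetical example: apply a continuous force in FixedUpdate() so the
// resulting acceleration is consistent regardless of frame rate.
public class ThrustExample : MonoBehaviour
{
    [SerializeField] private float thrust = 10f; // Newtons (example value)
    private Rigidbody rb;

    private void Awake()
    {
        rb = GetComponent<Rigidbody>();
    }

    private void FixedUpdate()
    {
        // GetKey() reports the held state, so polling it here is safe;
        // one-frame events such as GetKeyDown() should be caught in Update().
        if (Input.GetKey(KeyCode.W))
        {
            // ForceMode.Force is scaled by the fixed timestep internally,
            // so the same code yields the same velocity on every device.
            rb.AddForce(transform.forward * thrust, ForceMode.Force);
        }
    }
}
```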

It logically follows that the more time we spend in any given Fixed Update iteration, the less time we have for the next gameplay and rendering pass. 

Most of the time this results in minor, unnoticeable background processing tasks, since the Physics Engine barely has any work to do, and the FixedUpdate() callbacks have a lot of time to complete their work. 

However, in some games, the Physics Engine could be performing a lot of calculations during each and every Fixed Update.

This kind of bottlenecking in physics processing time will affect our frame rate, causing it to plummet as the Physics Engine is tasked with greater and greater workloads.

Essentially, the Rendering Pipeline will try to proceed as normal, but whenever a Fixed Update arrives that takes the Physics Engine a long time to process, the Rendering Pipeline will have very little time left to generate the current display before the frame is due, causing a sudden stutter.

This is in addition to the visual effect of the Physics Engine stopping early because it hit the Maximum Allowed Timestep.

All of this together would generate a very poor user experience. 

Hence, in order to keep a smooth and consistent frame rate, we will need to free up as much time as we can for rendering by minimizing the amount of time the Physics Engine takes to process any given timestep.

This applies in both the best-case scenario (nothing moving) and worst-case scenario (everything smashing into everything else at once).

There are a number of time-related features and values we can tweak within the Physics Engine to avoid performance pitfalls such as these.

    Static Colliders and Dynamic Colliders

A Dynamic Collider is simply a GameObject that contains both a Collider Component (which could be one of several types) and a Rigidbody Component.

We can also have Colliders that do not have a Rigidbody Component attached, and these are called Static Colliders.

    Collision detection

    Collider types

    The Collision Matrix

The Collision Matrix can be accessed through Edit | Project Settings | (Physics / Physics2D) | Layer Collision Matrix. 
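The matrix can also be adjusted from script. A hedged sketch, where the layer names ("Projectiles", "Debris") are hypothetical examples:

```csharp
using UnityEngine;

// Hypothetical example: disable collision testing between two layers at
// runtime, equivalent to unchecking their cell in the Layer Collision Matrix.
public class CollisionMatrixExample : MonoBehaviour
{
    private void Awake()
    {
        int projectiles = LayerMask.NameToLayer("Projectiles");
        int debris = LayerMask.NameToLayer("Debris");

        // The Physics Engine will no longer even test these two layers
        // against each other, saving collision-detection work.
        Physics.IgnoreLayerCollision(projectiles, debris, true);
    }
}
```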
    Rigidbody active and sleeping states

Every modern Physics Engine shares a common optimization technique, whereby objects that have come to rest have their internal state changed from an active state to a sleeping state.

The threshold value that controls the sleeping state can be modified under Edit | Project Settings | Physics | Sleep Threshold.
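Individual bodies can also be queried or put to sleep from script. A minimal sketch (the class name is ours):

```csharp
using UnityEngine;

// Hypothetical example: inspect and control a Rigidbody's sleep state.
// A sleeping body is skipped by the Physics Engine until something
// (a collision, an applied force, a WakeUp() call) reactivates it.
public class SleepStateExample : MonoBehaviour
{
    private Rigidbody rb;

    private void Awake()
    {
        rb = GetComponent<Rigidbody>();
    }

    public void ForceRest()
    {
        // Put the body to sleep immediately, regardless of its velocity.
        rb.Sleep();
    }

    public bool IsResting()
    {
        // True while the Physics Engine has this body in its sleeping state.
        return rb.IsSleeping();
    }
}
```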

    Ray and object casting

Another common feature of Physics Engines is the ability to cast a ray from one point to another and generate collision information with one or more of the objects in its path.

This is known as Raycasting. It is pretty common to implement several gameplay mechanics through Raycasting, such as firing a gun.

This is typically implemented by performing Raycasts from the player to the target location and finding any viable targets in its path (even if it's just a wall).

We can also obtain a list of targets within a finite distance of a fixed point in space using a Physics.OverlapSphere() check.

This is typically used to implement area-of-effect gameplay features, such as grenade or fireball explosions.

We can even cast entire objects forward in space using Physics.SphereCast() and Physics.CapsuleCast().

These methods are often used to simulate wide laser beams, or if we simply want to see what would be in the path of a moving character.
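The three query types above can be sketched as follows; the distances and radii are arbitrary example values, and the method names on the class are ours.

```csharp
using UnityEngine;

// Hypothetical examples of the three physics queries discussed above.
public class PhysicsQueryExample : MonoBehaviour
{
    public void FireWeapon()
    {
        // Raycast: hit-scan gunfire from this object forward, up to 100 units.
        RaycastHit hit;
        if (Physics.Raycast(transform.position, transform.forward, out hit, 100f))
        {
            Debug.Log("Hit " + hit.collider.name);
        }
    }

    public void Explode()
    {
        // OverlapSphere: every Collider within 5 units of this point,
        // e.g. targets caught in a grenade blast.
        Collider[] targets = Physics.OverlapSphere(transform.position, 5f);
        foreach (var c in targets)
        {
            // Apply damage, forces, and so on.
        }
    }

    public void WideLaser()
    {
        // SphereCast: sweep a 0.5-unit-radius sphere forward 50 units,
        // e.g. a wide laser beam or a moving character's path check.
        RaycastHit hit;
        if (Physics.SphereCast(transform.position, 0.5f, transform.forward,
                               out hit, 50f))
        {
            Debug.Log("Beam blocked by " + hit.collider.name);
        }
    }
}
```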

    Debugging Physics


  Physics performance optimizations


    Scene setup


      Scaling


      Positioning


      Mass


    Use Static Colliders appropriately


    Use Trigger Volumes responsibly


    Optimize the Collision Matrix


    Prefer Discrete collision detection


    Modify the Fixed Update frequency


    Adjust the Maximum Allowed Timestep


    Minimize Raycasting and bounding-volume checks


    Avoid complex Mesh Colliders


      Use simpler primitives


      Use simpler Mesh Colliders


    Avoid complex physics Components


    Let physics objects sleep


    Modify the Solver Iteration Count


    Optimize Ragdolls


      Reduce Joints and Colliders


      Avoid inter-Ragdoll collisions


      Replace, deactivate or remove inactive Ragdolls


    Know when to use physics


  Summary

 

6. Dynamic Graphics

  The Rendering Pipeline


  The GPU Front End


  The GPU Back End


  Fill Rate


  Overdraw


  Memory Bandwidth


  Lighting and Shadowing


  Forward Rendering


  Deferred Rendering


  Vertex Lit Shading (legacy)


  Global Illumination


  Multithreaded Rendering


  Low-level rendering APIs


  Detecting performance issues


  Profiling rendering issues


  Brute-force testing


  Rendering performance enhancements


  Enable/Disable GPU Skinning


  Reduce geometric complexity


  Reduce Tessellation


  Employ GPU Instancing


  Use mesh-based Level Of Detail (LOD)


  Culling Groups


  Make use of Occlusion Culling


  Optimizing Particle Systems


  Make use of Particle System Culling


  Avoid recursive Particle System calls


  Optimizing Unity UI


  Use more Canvases


  Separate objects between static and dynamic canvases


  Disable Raycast Target for noninteractive elements


  Hide UI elements by disabling the parent Canvas Component


  Avoid Animator Components


  Explicitly define the Event Camera for World Space Canvases


  Don't use alpha to hide UI elements


  Optimizing ScrollRects


  Make sure to use a RectMask2D


  Disable Pixel Perfect for ScrollRects


  Manually stop ScrollRect motion


  Use empty UIText elements for full-screen interaction


  Check the Unity UI source code


  Check the documentation


  Shader optimization


  Consider using Shaders intended for mobile platforms


  Use small data types


  Avoid changing precision while swizzling


  Use GPU-optimized helper functions


  Disable unnecessary features


  Remove unnecessary input data


  Expose only necessary variables


  Reduce mathematical complexity


  Reduce texture sampling


  Avoid conditional statements


  Reduce data dependencies


  Surface Shaders


  Use Shader-based LOD


  Use less texture data


  Test different GPU Texture Compression formats


  Minimize texture swapping


  VRAM limits


  Preload textures with hidden GameObjects


  Avoid texture thrashing


  Lighting optimization


  Use real-time Shadows responsibly


  Use Culling Masks


  Use baked Lightmaps


  Optimizing rendering performance for mobile devices


  Avoid Alpha Testing


  Minimize Draw Calls


  Minimize Material count


  Minimize texture size


  Make textures square and power-of-two


  Use the lowest possible precision formats in Shaders


  Summary

 

7. Virtual Velocity and Augmented Acceleration

  XR Development

 

  Emulation


  User comfort


  Performance enhancements


  The kitchen sink


  Single-Pass versus Multi-Pass Stereo Rendering


  Apply anti-aliasing


  Prefer Forward Rendering


  Image effects in VR


  Backface culling


  Spatialized audio


  Avoid camera physics collisions


  Avoid Euler angles


  Exercise restraint


  Keep up to date with the latest developments


  Summary

 

8. Masterful Memory Management

  The Mono platform


  Memory Domains

 

  Garbage collection


  Memory Fragmentation


  Garbage collection at runtime


  Threaded garbage collection


  Code compilation


  IL2CPP


  Profiling memory


  Profiling memory consumption


  Profiling memory efficiency


  Memory management performance enhancements


  Garbage collection tactics


  Manual JIT compilation


  Value types and Reference types


  Pass by value and by reference


  Structs are Value types


  Arrays are Reference types


  Strings are immutable Reference types


  String concatenation


  StringBuilder


  String formatting


  Boxing


  The importance of data layout


  Arrays from the Unity API


  Using InstanceIDs for dictionary keys


  foreach loops


  Coroutines


  Closures


  The .NET library functions


  Temporary work buffers


  Object Pooling


  Prefab Pooling


  Poolable Components


  The Prefab Pooling System


  Prefab pools


  Object spawning


  Instance prespawning


  Object despawning


  Prefab pool testing


  Prefab Pooling and Scene loading


  Prefab Pooling summary


  IL2CPP optimizations


  WebGL optimizations


  The future of Unity, Mono, and IL2CPP


  The upcoming C# Job System


  Summary

 

9. Tactical Tips and Tricks

  Editor hotkey tips


  GameObjects


  Scene window


  Arrays


  Interface


  In-editor documentation


  Editor UI tips


  Script Execution Order


  Editor files


  The Inspector window


  The Project window


  The Hierarchy window


  The Scene and Game windows


  Play Mode


  Scripting tips


  General


  Attributes


  Variable attributes


  Class attributes


  Logging


  Useful links


  Custom Editor scripts and menu tips


  External tips


  Other tips


  Summary
