Hi Jason!
I am Jose Antonio. I wrote to you a couple of years ago to ask for some
advice about how I could become a better professional. Your advice was
really useful to me, and I now hold a lead programmer position at
AiGameDev.com.
I would like to improve my skills as a lead programmer, which I think
takes more than technical skills alone. I would really appreciate any
advice on how to improve in this role, and in particular on what you
consider a lead programmer should do.
I would also be interested in hearing how I can communicate better with
my team, and whether there is any reading or book you would recommend
for this purpose.
My thanks in advance and my best wishes for your next awesome project ;)
Cheers!
-----------------------------------------
Hi Jose,
Sorry I didn't get back to you sooner. I understand the difficulties in
making the transition from the role of engineer to that of technical
leader. Here's a great article which sums up most of what I've learned
about being a good tech. lead. And I'm still learning! :)
http://www.mroodles.com/wordpress/hacking/great-mistakes-in-technical-leadership-reprint/
At Naughty Dog, our approach is to keep the interpersonal relationships
open, honest, respectful, and always focused on solving the problems
that we need to solve in order to ship a great game (rather than getting
side-tracked by irrelevant issues). That means trying our best to
check our egos at the door. We always aim to criticize IDEAS, never
people. We try to make sure everyone on the team feels like they have a
voice, and to communicate clearly that great ideas can come from anywhere
within the team (not just "the top"). We encourage everyone on the team
to take a leadership role -- to "own" the tasks for which they are
responsible, and to act like "producers," making sure that their tasks
are completed to a high level of quality every time.
I look at my role as a technical leader in much the same way that a lead
alto sax player fits into a jazz band... he or she is playing in the
band along with everyone else, but he or she also serves as a sort of
guidepost, to help keep the entire band playing in time and on cue.
That's really it -- set a good example, help to make sure everyone is
communicating, remove any road blocks, and then get out of the way and
trust your team to get the job done.
I hope this helps. At the end of the day, the only way to learn this stuff is to try (and fail many times!)
All the best!
J
Selected email conversations, errata and announcements regarding "Game Engine Architecture" by Jason Gregory
Sunday, February 12, 2012
Thursday, January 5, 2012
Advice for a game engine newcomer
From: Mike Breske
I am currently working as a consultant, doing a lot of work with ASP.NET web applications and business applications built with .NET C#. I'm a relatively new college graduate (applied mathematics degree), and I'm quickly learning that my passions lie outside the world of business-related applications. I love developing solutions using computer science and mathematics, but the domain isn't right for me.
A friend of mine shares similar sentiments, and we're both looking to work our way into the gaming industry (how many times have you heard that line?). It's my understanding that this is not an easy task, as experience is the key. Our plan is to do what we can outside of work and start building games, as bad as they may turn out at first. We want as much experience as possible with the various aspects of the game development process.
To that end, we want to build our own engine. I plan on purchasing your Game Engine Architecture textbook as a core resource. Do you think building a very basic engine from the ground up, then a game using that engine, is a reasonable task for a team of two guys to handle? Do you recommend a different approach or have any other advice? I obviously don't expect you to take a keen interest in my career or anything, but it's nice to be able to solicit guidance from someone who's in a place you want to be.
I saw that you worked on the Uncharted series and I had to see if I could get in contact with you, even for some small tidbits of knowledge. Uncharted 2 changed how I view what's possible in a game; the production value was incredible. I have some serious respect for the team that put that game together.
Thank you for your time!
Mike Breske
________________________
Hi Mike,
An insightful analysis! You are quite right that breaking into the game industry requires some significant effort. And the best way to prepare for entry into the industry is to develop some game software on your own.
That said, embarking on building a game engine from the ground up is a monumental task, and may not serve your immediate purposes. I would recommend instead that you get a hold of a pre-existing engine (e.g. Unreal, Half Life Source, Cryengine, Microsoft XNA, etc.) and "mod" it to make a game. That will give you exposure to a full-fledged engine, and allow you to build something that you can show off, in much less time than it would take to build an engine and then build a game. (Companies with 50 engineers have tried this and failed, so 2 engineers will probably find it a little tricky to pull off!) And building an engine without having seen one is a bit like trying to invent a new automobile without first learning how to change the oil on your existing car.
That said, I do suggest you read my book, as another way to get the "big picture" of how a game engine typically works. That combined with some experience with an existing engine should give you a solid foundation and put you ahead of your peers. Then if you have a demo or two, you should be golden.
You needn't make a full game, by the way. Try to figure out your passion, and make a demo to explore that. You might want to demonstrate your abilities at level design -- in which case you could make a couple of way-cool multiplayer arenas. Or you might want to explore 3D graphics -- in which case you might try using Ogre3D, or raw OpenGL or DirectX, to create a rendering demo with shadows, spherical harmonics, subsurface scattering -- or whatever technology seems interesting. The list goes on. Your demo doesn't have to be huge, either. It can be small and focused, as long as it demonstrates the ability to take on a non-trivial task, solve some difficult problems, and see it through to completion. That's what most employers are looking for -- someone who can not only start the job, but finish it and do it well.
Also, I can't stress enough the importance of practicing your 3D vector math. Knowing how to think in two and three dimensions, working equations in terms of vector notation (rather than always breaking things into x,y,z components), intuitively understanding that a dot product represents a projection, being comfortable with matrices as transformations of either points and vectors or (the inverse) coordinate axes -- these are also things game employers look for. Naughty Dog tests 3D math first, and if you fail that they don't even ask you about your software engineering skills.
Best of luck! Please keep me posted on how things go.
J
Errata/Suggestions from Chinese translator
From: Milo Yip
Hi Jason,
I am glad to tell you that, since last year, I have been translating
your book into Chinese for publication in China. This is actually my
first time doing translation work.
I will send you some issues I have found over the last few days, and I
hope I can ask you questions when I run into difficulties in your book.
Thanks.
____________
Hi Jason,
So far I have only translated up to chapter 5, so most of the
suggestions below concern the first few chapters. Feel free
to discuss them further.
Chapter 1
P.18 "... called a Brawler. This kind of fighting game can have
technical requirements more akin to those of a first-person shooter or
..."
I think "third-person games" is even more similar.
P.19 "Cruisin' USA"
It should be "Cruis'n USA"
P.23 AOL's "Neverwinter Nights" is an MMORPG but BioWare's "Neverwinter
Nights" is not. This may be confusing.
P.24 "Populus"
It should be "Populous"
P.28 "OGRE 3D is a ..."
The official name is OGRE, without "3D". There are many occurrences of
"OGRE 3D" in the book.
P.28 "Torque" is not an open-source engine. It should be categorized
under other commercial engines.
P.29 I think "3rd party SDKs" may be difficult to fit in the diagram,
since they can sit anywhere in the architecture. For example, physics
engines are shown under 3rd party SDKs but they also belong in the
Collision & Physics box.
P.32 "... hardware transform and lighting (hardware T & L) which began
with DirectX 8."
It should be "DirectX 7".
P.39 "high dynamic range (HDR) lighting and bloom"
More precisely, only HDR tone mapping is a post-process effect. HDR
lighting occurs in the lighting process.
P.43 "WiiMote"
Nintendo's official term is "Wii Remote"
Chapter 2
P.71 "_GNUC_"
It should be "__GNUC__". The text could also mention "__GNUC_MINOR__".
P. 74 "Of these, the three we will use most are"
It should be "the four".
P. 74 "General Tab"
The official term is "General Property Page". The same applies in other
sections as well.
P.74 "Configuration Properties/C++"
It should be "C/C++"
P. 74 "Output directory ... that the compiler/linker ultimately outputs"
Strictly speaking, this should be the linker only.
P. 76 "$(VCInstallDir). The directory in which Visual Studio's standard
C library is currently installed."
I think it points to the Visual C++ installation directory, which holds
the tools, headers and binaries for VC.
P.76 "General Tab/Include Directories"
"Additional Include Directories"
P.76 "General Tab/Debug Information"
"Debug Information Format"
P.79 "However, you cannot debug more than one program at a time."
This is not true. http://msdn.microsoft.com/en-us/library/a404w14b(v=VS.90).aspx
P.79 "Hitting F5 ... will run the .exe built by the start-up project"
This may not be precise. VC runs the "Command" property in the
Debugging Property Page of the start-up project, whose default value
is $(RemotePath), but it can be set to any executable or dump file.
P.79 "Break point"
I think the term should be the single word "breakpoint".
http://en.wikipedia.org/wiki/Breakpoint
P.81 "You can cast variables from one type to another... For example,
... as a floating-point value."
I think this example is not a good one, because even without the explicit
cast (float), C/C++ will implicitly convert myIntegerVariable to float for
the multiplication with another float. My suggestion is "(float)a/b", to
show the floating-point ratio of two integer variables.
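To make the suggested replacement concrete, here is a minimal sketch of the difference the explicit cast makes when both operands are integers. The function names ratio and truncatedRatio are illustrative only and do not come from the book.

```cpp
#include <cassert>

// Ratio of two integers as a float. The explicit cast on `a` forces
// floating-point division; `b` is then converted implicitly.
float ratio(int a, int b)
{
    assert(b != 0);
    return (float)a / b;
}

// Same operands without the cast on `a`: the division is done in integer
// arithmetic (truncating) and only the result is converted to float.
float truncatedRatio(int a, int b)
{
    assert(b != 0);
    return static_cast<float>(a / b);
}
```

With a = 1 and b = 2, the first function yields 0.5 while the second yields 0, which is the kind of difference a Watch window cast can make visible.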
P.81 "...inspect the rotation angle of any quaternion from within the debugger."
Is "from" an extra word?
P.99 "To represent a signed integer in 32 bits ... "
This paragraph may mention the term "sign-and-magnitude method".
Chapter 3
P.106 "This is because games are usually developed on a PC or Linux."
"Windows or Linux" or just "PC" would be better.
P.119 "...and later returned to the pool for use by other programs by
calling free()."
"Other programs" seems inaccurate for most operating systems today.
malloc() and free() operate on the heap of a single process. They should
not affect "other programs".
P.119 "In C++, the global new and delete ... to and from the heap."
More precisely, C++ terminology uses "free store" instead of "heap".
http://www.gotw.ca/gotw/009.htm
P.132 "Structured exception handling (SEH) is a very powerful feature of C++".
SEH is a feature of Win32. C++ standard uses the term "exception handling".
Chapter 4
P. 162 "... the final position of the jet's left wingtip in model space is ..."
It should be "world" space.
Chapter 5
P.210 "On the PS3, should be 128-bit aligned for maximum DMA
throughput, meaning they can only end in the bytes 0x00 or 0x80."
If the address can only end in the bytes 0x00 or 0x80, then it should
be 128-"byte" aligned.
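The arithmetic behind this correction can be checked with a short sketch. It assumes a power-of-two alignment expressed in bytes; isAligned and lowByte are hypothetical helper names, not part of any PS3 SDK.

```cpp
#include <cassert>
#include <cstdint>

// True if `address` is a multiple of `alignment` (in bytes).
// `alignment` must be a power of two so the mask trick is valid.
bool isAligned(std::uintptr_t address, std::uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (address & (alignment - 1)) == 0;
}

// Least significant byte of an address, i.e. the byte it "ends in".
std::uint8_t lowByte(std::uintptr_t address)
{
    return static_cast<std::uint8_t>(address & 0xFF);
}
```

A 128-byte-aligned address has its low seven bits clear, so its last byte can only be 0x00 or 0x80. A 128-bit (16-byte) aligned address only has its low four bits clear, so it may end in 0x10, 0x20, and so on.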
P.213 "...support single- and double-buffered allocators."
The title and the rest of the text use "single-frame allocator". I
suggest using that term consistently.
P.223 "That is, the linker never splits up a compiled translation unit
(.obj file) ..."
VC2008 supports function-level linking, which can split up a
compiled translation unit.
P.223 "So, following ... to avoid D-cache misses"
It should be "I-cache".
P.224 "Priority queue. ... A priority queue is typically implemented
as a binary search tree (e.g. std::priority_queue)."
A priority queue is more often implemented as a heap data structure.
http://en.wikipedia.org/wiki/Priority_queue#Usual_implementation
It should not be "thought of as a list that stays sorted at all times",
because it only permits dequeuing the maximum/minimum element, not
traversing the content.
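As a small illustration of this point, the sketch below uses std::priority_queue, which is a container adaptor that by default keeps a std::vector arranged as a binary heap. The helper name drainInPriorityOrder is made up for this example.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Pushes all values into a max-priority queue and pops them back out.
// Only top() is accessible; there is no way to iterate over the queue,
// so draining it is the only way to observe the full ordering.
std::vector<int> drainInPriorityOrder(const std::vector<int>& values)
{
    std::priority_queue<int> pq;
    for (int v : values)
        pq.push(v);

    std::vector<int> out;
    while (!pq.empty()) {
        out.push_back(pq.top());  // largest remaining element
        pq.pop();
    }
    assert(out.size() == values.size());
    return out;
}
```

The elements come out largest first, but that order is produced one pop at a time rather than stored as a sorted list.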
P.227 "If a divide-and-conquer approach is used, as in a binary
search, ..., only log_2 n elements will actually be visited by the
algorithm on average..."
It should be "in the worst case" instead of "on average".
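The worst-case bound can be observed directly by counting visited elements. The following is a minimal sketch, not code from the book; binarySearch and worstCaseVisits are illustrative names, and the exact bound for this particular implementation is floor(log_2 n) + 1 visits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Classic binary search over a sorted vector. Returns the index of `key`
// or -1 if absent, and reports how many elements were inspected.
int binarySearch(const std::vector<int>& sorted, int key, int& visited)
{
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    visited = 0;
    std::size_t lo = 0, hi = sorted.size();   // half-open range [lo, hi)
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        ++visited;
        if (sorted[mid] == key)
            return static_cast<int>(mid);
        if (sorted[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}

// Largest number of visits over every present key and two absent keys
// (one below and one above the stored range) for the array 0..n-1.
int worstCaseVisits(int n)
{
    std::vector<int> v(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i)
        v[static_cast<std::size_t>(i)] = i;

    int worst = 0;
    for (int key = -1; key <= n; ++key) {
        int visited = 0;
        binarySearch(v, key, visited);
        worst = std::max(worst, visited);
    }
    return worst;
}
```

For 1024 elements the worst case is 11 visits, while many keys are found sooner, so the logarithmic figure is an upper bound rather than an average.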
P.248 "The UTF-16 standard ... Each character takes up exactly 16 bits."
"UTF-16 ... produces a variable-length result of either one or two
16-bit code units per code point." http://en.wikipedia.org/wiki/UTF-16
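A short sketch of the encoding shows why a character does not always fit in 16 bits. The function name encodeUtf16 is hypothetical; the arithmetic follows the standard UTF-16 surrogate-pair scheme.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encodes one Unicode code point as UTF-16 code units.
// Code points below U+10000 take one unit; the rest take a surrogate pair.
std::vector<std::uint16_t> encodeUtf16(std::uint32_t codePoint)
{
    // Valid scalar values only: at most U+10FFFF and not a surrogate.
    assert(codePoint <= 0x10FFFF);
    assert(codePoint < 0xD800 || codePoint > 0xDFFF);

    if (codePoint < 0x10000)
        return { static_cast<std::uint16_t>(codePoint) };

    const std::uint32_t v = codePoint - 0x10000;          // 20-bit value
    return {
        static_cast<std::uint16_t>(0xD800 + (v >> 10)),    // high surrogate
        static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF))   // low surrogate
    };
}
```

For example, U+20AC (the euro sign) encodes as the single unit 0x20AC, while U+1F600 encodes as the pair 0xD83D 0xDE00.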
P.249 "Under Microsoft Windows, the data type wchar_t is used to
represent a single "wide" UTF-16 character (WCS)"
There are several issues in this sentence. First, wchar_t is a standard
C++ type; it is not specific to Windows. Second, the character set
used by wchar_t is not defined by the C++ standard.
http://en.wikipedia.org/wiki/Wide_character
Chapter 10
P. 421 "... DXT ... the basic idea is to break the texture into 2x2
blocks of pixels"
It should be 4x4
P.427 "The ambient term ... is a gross approximation of the amount of
indirect bounced light present in the scene"
Ambient light need not be only indirect light; it is light arriving at
the surface from almost all directions. It is common to use the ambient
term to approximate skylight.
P. 428
k_A, k_D and k_S should normally be vectors (RGB colors) and be set in
boldface, like C_i.
P.442 "The basic idea is to break a triangle ... MSAA does not require
a double-width frame buffer."
This section is not very precise. The fragment shader only needs to run
on multiple subsample fragments at the edges of polygons. Pixels that
are completely inside a polygon run the fragment shader only once per
pixel.
Also, it does not mention that MSAA requires 4x (or more, depending on
the MSAA level) the memory for the depth and stencil buffers.
P.443 "The depth buffer ... typically contains 16- or 24-bit floating
point depth information..."
Typically the depth buffer is in an integer format. Direct3D 9 offers only
a 32-bit floating-point depth buffer; the other formats are integer.
P.443 "When a fragment's color is written into the frame buffer, it depth..."
"its depth"
P.447 "Both FxComposer and Unreal Engine 3 provide powerful graphical
shading languages."
FxComposer doesn't. It only provides a traditional text-based shading language.
P. 461 "For each frustum plane, we move the plane inward a distance
equal to the radius, ... inside the frustum"
The plane should be moved "outward" by a distance equal to the
radius. If all of the inside tests are then true, the sphere "may be"
inside the frustum.
For details, see "Real-time Rendering 3rd Edition". In practice it is only
necessary to evaluate the plane equation at the center of the sphere and
compare the result to r for culling purposes.
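The test described here can be sketched in a few lines. This is an illustration under stated assumptions, not the book's code: the types Vec3 and Plane and the function names are invented, each plane normal is assumed to be unit length and to point into the frustum, and a point p is inside a plane when dot(n, p) + d >= 0.

```cpp
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

// Plane in the form dot(n, p) + d = 0, with n unit length and pointing
// toward the inside of the frustum (an assumption of this sketch).
struct Plane { Vec3 n; float d; };

// Signed distance from a point to a plane; positive means inside.
float signedDistance(const Plane& plane, const Vec3& p)
{
    return plane.n.x * p.x + plane.n.y * p.y + plane.n.z * p.z + plane.d;
}

// Conservative sphere-versus-frustum test. Comparing the center's signed
// distance against -radius is equivalent to pushing each plane outward by
// the radius and testing the center as a point. Returns false only when
// the sphere is certainly outside; true means it "may be" visible.
bool sphereMayBeVisible(const std::vector<Plane>& frustum,
                        const Vec3& center, float radius)
{
    assert(radius >= 0.0f);
    for (const Plane& plane : frustum) {
        if (signedDistance(plane, center) < -radius)
            return false;  // entirely on the outside of this plane
    }
    return true;
}
```

A true result is deliberately conservative: a sphere near a frustum corner can pass all six plane tests while still lying outside the frustum, which is why the wording "may be" matters.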
P.480 "In traditional triangle-rasterization-based rendering, all
lighting and shading calculations are performed on the triangle
fragments in view space."
More precisely, lighting is often performed in world, view or tangent space.
P. 481 "A typical G-buffer might contain ... depth, surface normal in
clip space"
I think they are more often stored in view or world space. Lighting
computations are difficult with normals in clip space.
P. 482 "... including vignette (slight blur around the edge of screen)..."
Vignette often means reduction of brightness and saturation at the
edge of screen.
P.484 sky rendering
I think that if the sky is rendered before other objects, then z-test and
z-write should be turned off to reduce bandwidth.
But on current-gen hardware, I think the sky is better rendered
last with z-test on and z-write off, which reduces the pixel shading
overhead in the occluded sky area.
Chapter 11
P. 523 the 2nd equation
w_ij should be w_i, and K_i should be K_j_i.
Chapter 12
P630. "... known as mechanics. This is the study of how forces affect
the behavior of objects. In a game engine, we are particularly
concerned with dynamics of objects -- how they move over time"
There are several issues. First, mechanics is a big branch of physics; its
subfields include kinematics, dynamics, quantum mechanics, etc.
Second, kinematics is the study of the motion of objects (without regard
to forces), while dynamics concerns how forces affect the
motion of objects. Some simple games only need kinematics (e.g. Pong).
P.631 "Hence the physics system attempts to provide realistic collision
responses ... interpenetrating."
With continuous collision detection (CCD), it is possible to provide
collision responses without penetration.
P. 692 "For example, the vertices of a static triangle mesh ... during
rendering"
I think this example may not be appropriate. It does not save a
per-vertex matrix multiplication, as a vertex position normally needs to
be transformed by the view-projection matrix anyway.
Errata
Hello,
I recently purchased a copy of your book. After having read through
about half of the book so far I wanted to send you some comments. I
thought about posting this in a review on Amazon, but rather than
diminish your otherwise excellent book I thought it better to simply
write you instead and perhaps this information can be included in a
future printing, or published on your website as errata. Because to
be honest the book is really good otherwise and I think that despite
the following issues it deserves an excellent review.
Anyway, my comments relate to the information you provide on
exception handling in chapter 3. For starters, you should be careful
about using the term "SEH" because technically SEH is very specific to
the Windows platform and is a native OS-level system service. C++
exception handling, on the other hand, is a *completely* different
implementation of an exception handling mechanism, which may or may
not be implemented using SEH by the compiler. If your code
is being compiled for any platform other than Windows, then the
compiler is obviously not using SEH. In the book I believe you are
actually referring to C++ exception handling.
Secondly, you mention on page 132 that "SEH (sic) adds a lot of
overhead to the program. Every stack frame must be augmented to
contain additional information required by the stack unwinding
process. Also, the stack unwind is usually very slow -- on the order
of two to three times more expensive than simply returning from the
function. Also, if even one function in your program (or a library
that your program links with) uses SEH, your entire program must use
SEH. The compiler can't know which functions might be above you on
the call stack when you throw an exception."
This entire paragraph is almost completely wrong. The only stack
frames that are augmented at all are ones that contain a catch block.
To demonstrate this, consider the following small program.
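#include <iostream>
#include <exception>
#include <stdexcept>   // std::runtime_error
#include <tchar.h>     // _tmain, _TCHAR (MSVC-specific)

void test1();
void test2();
void test3();

// No try/catch here: test1 simply forwards to test2.
void __declspec(noinline) test1()
{
    test2();
}

// The only function that contains a catch block.
void __declspec(noinline) test2()
{
    try { test3(); }
    catch(const std::exception& e)
    {
        std::cout << "Got an exception in test2" << std::endl;
    }
}

// Throws; the exception propagates up to the handler in test2.
void __declspec(noinline) test3()
{
    throw std::runtime_error("Error");
}

int _tmain(int argc, _TCHAR* argv[])
{
    test1();
    return 0;
}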
Now let's take a look at the assembly language generated by the
compiler. The following is simply the above code repeated but with
assembly inlined at each sequence point. It's not necessary to give
the assembly anything more than a cursory glance just to see how much
code the compiler is generating in each function.
void __declspec(noinline) test1()
{
test2();
004010C0 jmp test2 (4010D0h)
}
void __declspec(noinline) test2()
{
004010D0 push ebp
004010D1 mov ebp,esp
004010D3 push 0FFFFFFFFh
004010D5 push offset __ehhandler$?test2@@YAXXZ (401E90h)
004010DA mov eax,dword ptr fs:[00000000h]
004010E0 push eax
004010E1 mov dword ptr fs:[0],esp
004010E8 sub esp,8
004010EB push ebx
004010EC push esi
004010ED push edi
004010EE mov dword ptr [ebp-10h],esp
try { test3(); }
004010F1 mov dword ptr [ebp-4],0
004010F8 call test3 (401140h)
catch(const std::exception& e)
{
std::cout << "Got an exception in test2" << std::endl;
004010FD mov eax,dword ptr [__imp_std::endl (402038h)]
00401102 mov ecx,dword ptr [__imp_std::cout (402054h)]
00401108 push eax
00401109 push offset string "Got an exception in test2" (40215Ch)
0040110E push ecx
0040110F call std::operator<<<std::char_traits<char> > (401330h)
00401114 add esp,8
00401117 mov ecx,eax
00401119 call dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (40204Ch)]
}
0040111F mov eax,offset $LN7 (401125h)
00401124 ret
}
00401125 mov ecx,dword ptr [ebp-0Ch]
00401128 pop edi
00401129 pop esi
0040112A mov dword ptr fs:[0],ecx
00401131 pop ebx
00401132 mov esp,ebp
00401134 pop ebp
00401135 ret
void __declspec(noinline) test3()
{
00401140 push ebp
00401141 mov ebp,esp
00401143 and esp,0FFFFFFF8h
00401146 push 0FFFFFFFFh
00401148 push offset __ehhandler$?test3@@YAXXZ (401E82h)
0040114D mov eax,dword ptr fs:[00000000h]
00401153 push eax
00401154 mov dword ptr fs:[0],esp
0040115B sub esp,4Ch
throw std::runtime_error("Error");
0040115E push offset string "Error" (402178h)
00401163 lea ecx,[esp+8]
00401167 call dword ptr [__imp_std::basic_string<char,std::char_traits<char>,std::allocator<char> >::basic_string<char,std::char_traits<char>,std::allocator<char> > (40203Ch)]
0040116D lea ecx,[esp+20h]
00401171 mov dword ptr [esp+54h],0
00401179 call dword ptr [__imp_std::exception::exception (4020E4h)]
0040117F lea eax,[esp+4]
00401183 mov byte ptr [esp+54h],1
00401188 push eax
00401189 lea ecx,[esp+30h]
0040118D mov dword ptr [esp+24h],offset std::runtime_error::`vftable' (4021A0h)
00401195 call dword ptr [__imp_std::basic_string<char,std::char_traits<char>,std::allocator<char> >::basic_string<char,std::char_traits<char>,std::allocator<char> > (402040h)]
0040119B push offset __TI2?AVruntime_error@std@@ (402410h)
004011A0 lea ecx,[esp+24h]
004011A4 push ecx
004011A5 mov byte ptr [esp+5Ch],0
004011AA call _CxxThrowException (401E00h)
$LN10:
004011AF int 3
}
int _tmain(int argc, _TCHAR* argv[])
{
test1();
004011B0 call test1 (4010C0h)
return 0;
004011B5 xor eax,eax
}
Note that neither main nor test1() contains any code having to do
with exception handling. This works because, although it's true that
the compiler does not know what functions will be on the callstack at
the time the exception is thrown, it *does* know which functions
*might* try to handle exceptions. So in each of these functions, it
generates code to modify the exception handling chain. What really
happens when the stack unwinds is that the runtime walks through the
stack frames and the exception handling chain in parallel. If it finds
a stack frame that has no entry at all in the chain, or one whose
entries don't match the current exception, it simply calls the
destructors for all constructed objects in that frame and then moves
up the stack until it finds a matching handler or the program
terminates. But there is no extra code generated in any of these
functions. These destructors would have had to be called anyway, even
if the function had terminated normally.
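To make the unwinding order concrete, here is a minimal portable sketch.
It is not taken from the book or from the listing above; the names
Tracer, inner, middle, run_demo, and g_log are made up for illustration.
It only shows the language-level behaviour (destructors in frames
without a catch block run before the handler body executes) and says
nothing about what code a particular compiler emits.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical log that records the order in which events happen.
static std::vector<std::string> g_log;

// Hypothetical RAII type: its destructor appends "~<name>" to the log.
struct Tracer {
    explicit Tracer(std::string n) : name(std::move(n)) {}
    ~Tracer() { g_log.push_back("~" + name); }
    std::string name;
};

// Throws while a local object is alive.
void inner() {
    Tracer t("inner");
    throw std::runtime_error("Error");
}

// Contains no try/catch, but owns a local object that must be destroyed
// when the exception passes through this frame.
void middle() {
    Tracer t("middle");
    inner();
}

// Catches the exception and returns the recorded order of events.
std::vector<std::string> run_demo() {
    g_log.clear();
    try {
        middle();
    } catch (const std::exception& e) {
        g_log.push_back(std::string("caught: ") + e.what());
    }
    return g_log;
}
```

Calling run_demo() from main and printing each entry should show the two
destructor entries first, innermost frame first, followed by the entry
written by the catch block.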
That's my biggest comment. My final comment concerns the early
chapters, where you discuss Visual Studio and the different types of
builds: debug, release, production, and hybrid. At one point you
mention that systems like GNU make make it easy to define certain
options on a per-translation unit basis, but that this is very
difficult in Visual Studio. In fact it's very easy! Right click a
.cpp file in the Solution Explorer, click Properties, and bam. Any
settings you make in that window are applied only to that translation
unit. You can change any setting that you could normally change on a
per-project basis, as long as it is not a linker setting.
Preprocessor, optimization, etc. are all changeable on a
per-translation unit basis though.
Aside from these comments, however, the book is definitely a
refreshing addition to the sometimes diluted market for game engine
books. Too many books try to cash in on the game craze, and while it's
clear the authors have experience, the books are not rigorous enough
to leave one satisfied. I like the encyclopedic approach taken in
this book, and I'd definitely be interested in seeing an additional
volume at some point in the future.
Regards,
Zachary Turner