Thoughts Serializer


From Python to Lua


(This post was originally published at #AltDevBlogADay)

All game developers, sooner or later, learn to appreciate scripting languages: that magical thing that lets others do your job, helps the team scale, strengthens the game code/engine separation, and brings sandboxing, faster prototyping of ideas, fault isolation, easy parametrization, and more. Every game has to be data driven to some degree to be manageable, and stopping at simple configuration files with many different custom parsers, without going the extra mile of adding a full scripting language, is a bad design choice 90% of the time.

Today a developer can choose from a large variety of scripting languages, or even go crazy and implement one of their own. As it happens, the language most favored by game developers is Lua. It’s easy to understand why Lua is the favorite, but other options are used as well, for example Python and the recently rising force of JavaScript.

Here I would like to share some of my experience moving a game engine from Python to Lua. Read the rest of this entry »

Optimizing script language performance with custom memory allocators


Last weekend I did some exploration of scripting language execution performance, specifically on the memory allocation side of things, and I would like to share my findings.

Script languages and memory usage

As you probably know, scripting languages (most of them at least, like Python, Lua, etc.) tend to make a huge number of small allocations on the heap. Almost everything lives on the heap, and if you care about performance, you start to feel homesick for your beloved C stack! Anyway, nothing comes for free, and a scripting language has to take something from you in exchange for all the goods it gives you back. So the best you can do is make sure you have the best memory allocator for the job.

Doing too many small allocations and releases on the heap can create memory fragmentation, along with all the evil that comes with it. The common approach is to create a specialized memory allocator that serves small, fixed-size blocks of memory to the scripting language, carved out of a bigger chunk reserved from the system. This is common in all “realtime” and performance-intensive applications like games, and something I have done many times to gain performance.
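To give a rough idea of the technique, here is a minimal sketch of such a small-block pool. This is not my actual allocator; the class and member names are made up for illustration, and error handling is left out.

    // Illustrative sketch of the general technique, not the allocator
    // discussed in this post. A fixed-size small-block pool: it reserves one
    // big chunk up front and hands out equally sized blocks from an intrusive
    // free list.
    #include <cstddef>
    #include <cstdlib>

    class SmallBlockPool {
    public:
        SmallBlockPool(std::size_t blockSize, std::size_t blockCount)
            : m_blockSize(blockSize < sizeof(void*) ? sizeof(void*) : blockSize),
              m_blockCount(blockCount),
              m_memory(static_cast<char*>(std::malloc(m_blockSize * blockCount))),
              m_freeList(0)
        {
            // Thread every block onto the free list.
            for (std::size_t i = 0; i < m_blockCount; ++i) {
                void* block = m_memory + i * m_blockSize;
                *static_cast<void**>(block) = m_freeList;
                m_freeList = block;
            }
        }

        ~SmallBlockPool() { std::free(m_memory); }

        void* allocate(std::size_t size) {
            // Anything too big, or a pool that ran dry, falls back to malloc().
            if (size > m_blockSize || m_freeList == 0)
                return std::malloc(size);
            void* block = m_freeList;
            m_freeList = *static_cast<void**>(block);
            return block;
        }

        void deallocate(void* block) {
            char* p = static_cast<char*>(block);
            if (p < m_memory || p >= m_memory + m_blockSize * m_blockCount) {
                std::free(block); // came from the malloc() fallback
                return;
            }
            *static_cast<void**>(block) = m_freeList;
            m_freeList = block;
        }

    private:
        std::size_t m_blockSize;
        std::size_t m_blockCount;
        char*       m_memory;
        void*       m_freeList;
    };

Allocation and release are just a couple of pointer swaps, so a pool like this is about as fast as a heap can get for its one block size; the interesting question, as it turned out, is where its memory ends up relative to everything else.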

Can’t beat the standard malloc

What I discovered with my latest attempt is that it has become quite hard to beat the GNU implementation of malloc(). That used to be easy in the past when you focused on a specialized case (e.g. small blocks of memory). Not that you can’t do better if you try hard, but at this point the stock malloc() implementation is already super-fast for 99.9% of desktop applications. Rest assured that you will not be able to do much better. That is not the case, however, for embedded devices, which don’t enjoy the same virtual memory benefits as desktop computers.

My hand-tuned specialized memory allocator for small blocks of memory (<= 256 bytes) was not able to be more than 1% faster than the native malloc() on OS X 10.6. On the iPhone, however, the same allocator was twice as fast as the native malloc()! Since the target was the iPhone from the beginning, that seemed like a big win! But when I set up a small benchmark in the scripting environment that allocated game engine objects and released them again in various patterns, the results were disappointing. Using my specialized (and twice as fast) allocator resulted in an improvement of only about 5% in execution speed in a memory-intensive benchmark, and some tests were even slower! That was odd and, most of all, not good!
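For the record, the raw-allocator comparison above is nothing fancier than a loop along these lines. This is a sketch reusing the hypothetical SmallBlockPool from before, not my actual test harness:

    // Crude timing sketch: hammer the pool and plain malloc() with bursts of
    // same-size allocations. A loop like this only measures raw allocator
    // speed; it says nothing about cache behaviour with real objects.
    #include <cstdio>
    #include <cstdlib>
    #include <ctime>

    int main() {
        const int kRounds = 10000;
        const int kBurst  = 1000;
        void* blocks[kBurst];
        SmallBlockPool pool(64, kBurst);

        std::clock_t t0 = std::clock();
        for (int i = 0; i < kRounds; ++i) {
            for (int j = 0; j < kBurst; ++j) blocks[j] = pool.allocate(64);
            for (int j = 0; j < kBurst; ++j) pool.deallocate(blocks[j]);
        }
        std::printf("pool:   %ld clocks\n", (long)(std::clock() - t0));

        t0 = std::clock();
        for (int i = 0; i < kRounds; ++i) {
            for (int j = 0; j < kBurst; ++j) blocks[j] = std::malloc(64);
            for (int j = 0; j < kBurst; ++j) std::free(blocks[j]);
        }
        std::printf("malloc: %ld clocks\n", (long)(std::clock() - t0));
        return 0;
    }

The scripted benchmark was nothing like this, of course; it allocated and released real engine objects from the scripting side, and that is exactly where the picture changed.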

Why I was failing

After some inspection and a few tests that made it less probable that I was doing something really stupid, I narrowed down the cause.

In most cases when using a scripting language you have some classes defined in C++ that you instantiate from the scripting language. Take for example a 3D vector class “CVector3” defined in C++. When you instantiate it from the script language you get two allocations: one in the scripting language for the “proxy” object, and one in the C++ environment for the object itself. When you give a new allocator to the scripting language, you only “optimize” the first allocation; the one in C++ still goes through the system default allocator.
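To see where the two allocations come from, here is roughly what a hand-written Lua-style constructor binding for such a class looks like. This is only a sketch under the assumption of a Lua binding (generators like tolua++ or SWIG differ in detail, and metatable registration is omitted), not the engine’s real binding code, but the two-allocation pattern is the same:

    // Illustrative sketch of a typical constructor binding for CVector3.
    #include <lua.hpp>

    class CVector3 {
    public:
        CVector3(float x, float y, float z) : x(x), y(y), z(z) {}
        float x, y, z;
    };

    int vector3_new(lua_State* L) {
        float x = (float)luaL_checknumber(L, 1);
        float y = (float)luaL_checknumber(L, 2);
        float z = (float)luaL_checknumber(L, 3);

        // Allocation #1: the proxy userdata, served by the scripting
        // language's allocator (the one you replace, e.g. via a custom
        // lua_Alloc passed to lua_newstate).
        CVector3** proxy = (CVector3**)lua_newuserdata(L, sizeof(CVector3*));

        // Allocation #2: the actual C++ object, served by operator new and,
        // unless you do something about it, the system default allocator.
        *proxy = new CVector3(x, y, z);

        luaL_getmetatable(L, "CVector3");
        lua_setmetatable(L, -2);
        return 1;
    }

Replacing the script interpreter’s allocator only affects allocation #1; allocation #2 is the half that keeps going through the system.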

And since you optimized half of the allocations, you expect half the performance boost… well… wrong. It turns out you can even end up slower this way. The secret here is the CPU cache. By doing the above, you get two memory blocks that are usually accessed together but sit far apart in memory. This can hurt performance badly on a device with slow memory like the iPhone.

The solution

The solution was of course to use the same allocator on the C++ side, by overriding the “new” operator of the class. This makes the block of memory allocated on the script side end up close to the block allocated on the C++ side, so accessing the object touches only one area of memory and yields nice cache hits. Performance went up by 30%, which was nice and expected.
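A sketch of what that looks like, continuing the hypothetical SmallBlockPool and CVector3 from above (the real Sylphis3D code surely differs):

    #include <cstddef>

    // g_scriptPool is a hypothetical global pointing at the same pool that
    // was handed to the script interpreter as its allocator.
    extern SmallBlockPool* g_scriptPool;

    class CVector3 {
    public:
        CVector3(float x, float y, float z) : x(x), y(y), z(z) {}

        // Route the C++-side allocation through the same pool as the
        // script-side proxy, so the two blocks land near each other in memory
        // and stay cache-friendly.
        static void* operator new(std::size_t size) { return g_scriptPool->allocate(size); }
        static void  operator delete(void* p)       { g_scriptPool->deallocate(p); }

        float x, y, z;
    };

With this in place, the “new CVector3(…)” call inside the binding above automatically comes out of the same chunk of memory as the proxy userdata, with no change to the binding code itself.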

Another interesting thing I found along the way is that, on the iPhone, if I just override the “new” operator of a class and make it allocate memory with plain malloc(), without using my allocator at all, the system is again faster!

This probably stems from the fact that “new” does not go through plain malloc() (I didn’t bother to check) the way the scripting language environment does, so the allocated blocks end up in different arenas at different parts of memory, losing performance for the same reason as above!

So, keep your related allocations close together when crossing the language barrier!

Having both Mac OS X and iPhone targets in Xcode


I have a library (guess what it is!) for use on the iPhone. For the library I have an Xcode project with two targets: one for the simulator and one for the actual device. Nothing wild, as you can see.

The joy came, however, when I decided to also use the library on OS X. As one would guess, I went ahead and created a new target for OS X, which started out as a copy of the original iPhone target, but with the “Base SDK” for that target switched to “Mac OS X 10.6”.

When I compiled for the first time, I came to the realization that even with this option set to Mac OS X 10.6, Xcode was still compiling for the iPhone! From that point on, nothing worked. I tried every option and setting for the project and the target… but nothing. Xcode seemed locked into compiling for the iPhone, totally ignoring the SDK setting. I ended up quitting and restarting Xcode, restarting the Mac… and some other arcane spells and voodoo I can’t really confess here… nothing… Then I gave up and dreamed of the nice days of SCons and even makefiles, when you knew what was under the hood…

All that until today, when in a moment of enlightenment I clicked on the “Overview” dropdown of Xcode with the ALT key pressed. And “boom” (as Jobs would say), there it was… “the choice”! By holding ALT while selecting the “Overview” dropdown, Xcode lets you choose the active SDK! This was so overwhelming for me that I tweeted about it and also decided to write a blog post, so that no one else has to go through what I did.

So, the bottom line for having both OS X and iPhone targets:

  1. Make a new target for OS X and set it up.
  2. ALT-click the Overview to select your active SDK.
  3. Compile.
  4. Have a nice day!

Python Easter Egg


Since Easter is coming, here is an Easter egg for you:

If you type the following in Python:

    from __future__ import braces

You get:

    SyntaxError: not a chance

:D :D :D

Ray Tracing into a Sparse Voxel Octree


And just when you thought you were through with tracing things all over the place… John Carmack strikes back with a mortal blow: something about ray tracing into a sparse voxel octree!!

The article doesn’t really say much (nothing, actually) about the algorithm, and this is where the fun/fuss starts! I can’t wait to see all the amazing/crazy ideas people from all over the world will come up with about what John is actually talking about. Plots upon plots will emerge… flames… Read the rest of this entry »

NVIDIA to Acquire AGEIA Technologies


According to this press release, NVIDIA will acquire AGEIA Technologies. Yep! The well-known physics software and hardware vendor. In my mind this means that future NVIDIA-based accelerators will support physics acceleration, too. It basically means the death of the PhysX processor, since the GPU can do the job easily at no extra cost.

Actually, the PPU solution was never going to work. I find it quite hard to believe people would ever Read the rest of this entry »

Software Engineering Proverbs


Must read.. :)

One I really liked:

Q: How many QA testers does it take to change a lightbulb?
A: QA testers don’t change anything. They just report that it’s dark.

My Babbler ColdFusion


If you were a compiler and you wanted to express yourself about an index that was mistakenly out of bounds, what would you say?

I would say something like this: “Line 342: Index out of bounds”

Read on to find out what ColdFusion would say!!! Read the rest of this entry »

Double-Check Your Excel


Do you trust Microsoft Excel to do your financial plans? Do you count on it for business matters? Do you even use it as a calculator?

Think again… since Excel can’t even multiply correctly!!

I wonder how they accomplished such a thing! What kind of COM, CORBA, and VB must be involved for a simple multiplication to end up with a bug!?

P.S. If I recall correctly, there was a similar bug in the calculator that came with a previous version of Windows…


Getting Started Again… True Megatexture


Greetings everybody!

It’s been a long time since this blog was updated, I know! Well, I have been kind of busy lately: looking for a job, finding a job, then doing the job, and finally trying to get some free time for summer vacation and spare-time projects. You know that a new job always makes things a bit harder until you get comfortable. The good thing is that work has now stabilized and I am finally starting to have some free time.

I’m not going to write a lot this time. I just wanted to let you know that I’m alive and well. I’m also planning to start working on Sylphis3D again in a more committed and persistent manner.

Regarding Sylphis3D, I would also like to let you know that I’m working on something big! You probably know my opinion on megatexturing for terrain. Well, I managed to come up with an algorithm that applies megatextures to any kind of geometry. Yes, you heard correctly! The new Sylphis3D will have fully virtual texturing.

You will be able to apply a texture of any size to any kind of mesh with no impact on performance! Would you like to apply a 4096×4096 texture to that talisman the player wears, just in case someone gets close enough to see it? It’s OK! It doesn’t hurt! Go ahead! It’s up to you! At the moment I am at the implementation stage and it looks like things will work out just fine! Stay tuned…