• How do you optimize a game for specific hardware?
    2 replies, posted
Often I hear that more performance can be squeezed out of consoles over their lifecycle (most notably in exclusives) because games can be programmed around hardware that is essentially identical from one unit to the next, whereas this is harder to do on other platforms such as PC or Android, because their hardware varies widely. Thus, I have some questions:
- How true is this?
- If it is true, does it hold even on PC or Android? For example, could I optimize a game to perform better only on the hardware present in my PC, or on a specific phone model?
- If so, what topics must one learn to do such optimizations?
I might be asking something that delves into extremely low-level graphics programming, so I'm not expecting any code snippets or anything like that. But I'd just like a point in the right direction as far as knowing what to research.
This is a tough question. Here are a few things you can consider:

Yes, it is possible to optimize for a certain piece of hardware. For example, consider hard constraints like RAM size (don't use more than the machine has, or else the system will fall back on a swap file and everything will be much slower). I worked on an ARM processor with 32 KB of instruction RAM, meaning that a program of 32 KB or less would, in the general case, perform better than one larger than 32 KB. Certain operations are faster on some CPUs/GPUs than on others (maybe one CPU has very fast floating-point hardware while another doesn't support floating point in hardware at all, e.g. some ARM processors).

There are other optimizations to consider for something like the GPU. Certain GPUs perform certain operations much faster, so doing something one way can have a large performance impact compared to doing it a different way (maybe, on a certain processor, switching shaders is a lot slower than just having run-time branches).

Nothing is fundamentally different in these kinds of optimizations between consoles and PCs/phones, so yes, these sorts of things are applicable there too. However, as you stated, there is a lot of variance in hardware on these platforms, so this is done more rarely.
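To make the floating-point point a bit more concrete: one common workaround on a CPU without a hardware FPU is fixed-point arithmetic, where fractional values are stored in plain integers. The following is only a minimal sketch under assumptions of my own, not something described in the post above: the Q16.16 layout and every name in it (`fixed_t`, `fixed_mul`, and so on) are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Q16.16 fixed point: 16 integer bits and 16 fractional bits in a 32-bit int.
// All names here are hypothetical, chosen for this sketch.
using fixed_t = std::int32_t;
constexpr int kFracBits = 16;
constexpr fixed_t kOne = fixed_t{1} << kFracBits;

// Conversions. These use floating point themselves, so on an FPU-less target
// they would ideally run only at compile time or in offline tools.
constexpr fixed_t to_fixed(double v) { return static_cast<fixed_t>(v * kOne); }
constexpr double to_double(fixed_t v) { return static_cast<double>(v) / kOne; }

// Addition is a plain integer add.
constexpr fixed_t fixed_add(fixed_t a, fixed_t b) { return a + b; }

// Multiplication uses a 64-bit intermediate, then shifts away the extra
// 16 fractional bits. Right-shifting a negative value is arithmetic on
// mainstream compilers (and guaranteed since C++20).
constexpr fixed_t fixed_mul(fixed_t a, fixed_t b) {
    return static_cast<fixed_t>((static_cast<std::int64_t>(a) * b) >> kFracBits);
}

// Division pre-scales the dividend so the quotient keeps its fractional bits.
inline fixed_t fixed_div(fixed_t a, fixed_t b) {
    assert(b != 0 && "division by zero");
    return static_cast<fixed_t>((static_cast<std::int64_t>(a) * kOne) / b);
}
```

Whether this is actually a win depends entirely on the target: on a CPU with a fast FPU, ordinary `float` math is usually simpler and at least as fast, which is the kind of per-hardware trade-off being discussed here.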
Expanding on nuke's answer (mostly for the CPU), just for further research:

Certain CPUs may have extra coprocessors or vector units (e.g. SSE) that allow specific operations to be 1. processed in fewer cycles or with less memory handling, and/or 2. executed with fewer instructions and/or branches, which puts less pressure on the CPU pipeline.

I feel like threading for games is too big a topic to cover or be specific about (especially when considering complex architectures such as CELL/EmotionEngine), but it is worth looking into. In general, multithreading in games tends to be a queue system that assigns operations to different threads. Also worth considering is offloading processing to the GPU, as with PhysX, which may be architecture-dependent depending on your targets.

Speaking of handling memory, it can get quite complex. Allocation can cost quite a few cycles if not optimized; block allocators are a common way to reduce the time spent searching the heap (expanding on that, there may be tricks you can rely on for byte alignment, but I'm not too familiar with them). CPU caches and data alignment are also worth looking at, though results will vary a lot once you go down this path, and will depend on whether you are optimizing for speed or for memory (optimizing for memory can also help speed if less bandwidth is needed for the data, since the data is properly "blocked" together).

Seek time may also be entirely different on an optical drive vs a hard disk, so streaming needs to be balanced such that you can fetch data reliably without stuttering or delays. I believe an example of a custom block allocator built around seeking would be RAGE's "Megatextures" (virtual textures).

Those are a few things one could begin to study, out of curiosity.
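For anyone curious what a block allocator looks like, here is a minimal sketch of a fixed-size block pool, assuming a single thread and one block size per pool. The `BlockPool` class and its member names are hypothetical, made up for this example; real engines layer much more on top (thread safety, multiple size classes, debugging aids).

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Fixed-size block pool: one upfront allocation, then O(1) acquire/release
// through an intrusive free list. Hypothetical names, single-threaded sketch.
class BlockPool {
public:
    BlockPool(std::size_t block_size, std::size_t block_count)
        : block_size_(round_up(block_size < sizeof(Node) ? sizeof(Node) : block_size)),
          block_count_(block_count),
          memory_(static_cast<unsigned char*>(::operator new(block_size_ * block_count))) {
        // Thread every block onto the free list, last block first, so that
        // the first acquire() returns the lowest address.
        for (std::size_t i = block_count; i-- > 0;) {
            release(memory_ + i * block_size_);
        }
    }
    ~BlockPool() { ::operator delete(memory_); }
    BlockPool(const BlockPool&) = delete;
    BlockPool& operator=(const BlockPool&) = delete;

    // Pops a block off the free list; returns nullptr when the pool is exhausted.
    void* acquire() {
        if (free_list_ == nullptr) return nullptr;
        Node* node = free_list_;
        free_list_ = node->next;
        --free_count_;
        return node;
    }

    // Pushes a block back onto the free list. p must come from this pool.
    void release(void* p) {
        assert(owns(p));
        free_list_ = new (p) Node{free_list_};  // reuse the block's own bytes as the link
        ++free_count_;
    }

    std::size_t free_count() const { return free_count_; }

private:
    struct Node { Node* next; };

    // Round block sizes up so every block stays aligned for any ordinary type.
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) / a * a;
    }

    // True if p points at the start of a block inside this pool's memory.
    bool owns(const void* p) const {
        const auto* b = static_cast<const unsigned char*>(p);
        return b >= memory_ && b < memory_ + block_size_ * block_count_ &&
               static_cast<std::size_t>(b - memory_) % block_size_ == 0;
    }

    std::size_t block_size_;
    std::size_t block_count_;
    unsigned char* memory_;
    Node* free_list_ = nullptr;
    std::size_t free_count_ = 0;
};
```

A caller would create one pool per object size, for example `BlockPool pool(sizeof(Particle), 1024);` with some hypothetical `Particle` type, then construct objects into blocks from `pool.acquire()` with placement new. Because neither `acquire()` nor `release()` searches a general-purpose heap, the per-allocation cost stays small and predictable.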