Data oriented programming

Discussion:

Gabriel Sassone

2010-09-29 09:31:49 UTC

Hi everyone,
just wondering if anyone of you has some experiences with "Data-oriented
programming" stuff.
To push the limits of the PS3 you must use SPUs very well, and to do this I
found that the "data-oriented programming" is really useful and let you
create a very scalable framework.
There are some papers around (the most know is probably the one from Sony):

http://research.scee.net/files/presentations/gcapaustralia09/Pitfalls_of_Object_Oriented_Programming_GCAP_09.pdf

I've experienced in small scale (on PS3) and found it really intriguing and
performance wise.
Also it is possible to use this mentality to improve performances on X360
and PC, almost all platforms!

What are your experiences with that?

Thanks!
Gabriel

Tony Albrecht

2010-09-30 00:47:34 UTC

Permalink

Hi Gabriel,
A lot of the work I do is related to optimisation of code or at least
producing code that has to perform well, so I use a lot of the concepts
mentioned in that paper extensively (and I should too, I wrote it :) ). The
key thing to consider is your data - how much of it you access and in what
order and where it goes. I find that just reordering data into contiguous
chunks has a positive effect, regardless of the state of the code using it.
Once the data is fixed, massaging the code that uses it into a better state
( SIMD becomes easier on contig data) improves performance even more.

Taking an existing code base and retrofitting DOD concepts into it can
create some pretty nasty code, but designing with data in mind right from
the start can produce maintainable, readable, extensible code that runs
fast.

-Tony

On 29 September 2010 19:01, Gabriel Sassone <***@gmail.com> wrote:

> Hi everyone,
> just wondering if anyone of you has some experiences with
> "Data-oriented programming" stuff.
> To push the limits of the PS3 you must use SPUs very well, and to do this I
> found that the "data-oriented programming" is really useful and let you
> create a very scalable framework.
> There are some papers around (the most know is probably the one from Sony):
>
>
> http://research.scee.net/files/presentations/gcapaustralia09/Pitfalls_of_Object_Oriented_Programming_GCAP_09.pdf
>
> I've experienced in small scale (on PS3) and found it really intriguing and
> performance wise.
> Also it is possible to use this mentality to improve performances on X360
> and PC, almost all platforms!
>
> What are your experiences with that?
>
> Thanks!
> Gabriel
>
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>
>

Mike Acton

2010-09-30 01:47:14 UTC

Permalink

I'll add my two cents to what Tony said: I personally think the most
important thing to keep in mind is that fundamentally the goal of *any *program
(at any level of the hierarchy - from application to function to single
instruction) is to transform data from one form into another. Understanding
that data being transformed is the key to doing that well. The reality is
that it's always a mix of mental models, but the real *problem* in practice
is that not only is there no real understanding of the data being
transformed, but it's ignored *completely* - as though "data" is some
abstract concept that can be ignored.

Mike.

On Wed, Sep 29, 2010 at 5:47 PM, Tony Albrecht <***@gmail.com>wrote:

> Hi Gabriel,
> A lot of the work I do is related to optimisation of code or at least
> producing code that has to perform well, so I use a lot of the concepts
> mentioned in that paper extensively (and I should too, I wrote it :) ). The
> key thing to consider is your data - how much of it you access and in what
> order and where it goes. I find that just reordering data into contiguous
> chunks has a positive effect, regardless of the state of the code using it.
> Once the data is fixed, massaging the code that uses it into a better state
> ( SIMD becomes easier on contig data) improves performance even more.
>
> Taking an existing code base and retrofitting DOD concepts into it can
> create some pretty nasty code, but designing with data in mind right from
> the start can produce maintainable, readable, extensible code that runs
> fast.
>
> -Tony
>
> On 29 September 2010 19:01, Gabriel Sassone <***@gmail.com> wrote:
>
>> Hi everyone,
>> just wondering if anyone of you has some experiences with
>> "Data-oriented programming" stuff.
>> To push the limits of the PS3 you must use SPUs very well, and to do
>> this I found that the "data-oriented programming" is really useful and let
>> you create a very scalable framework.
>> There are some papers around (the most know is probably the one from
>> Sony):
>>
>>
>> http://research.scee.net/files/presentations/gcapaustralia09/Pitfalls_of_Object_Oriented_Programming_GCAP_09.pdf
>>
>> I've experienced in small scale (on PS3) and found it really intriguing
>> and performance wise.
>> Also it is possible to use this mentality to improve performances on X360
>> and PC, almost all platforms!
>>
>> What are your experiences with that?
>>
>> Thanks!
>> Gabriel
>>
>> _______________________________________________
>> Sweng-Gamedev mailing list
>> Sweng-***@lists.midnightryder.com
>>
>> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>>
>>
>
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>
>

Mat Noguchi

2010-09-30 16:49:58 UTC

Permalink

Incidentally, Rico Mariani posted this earlier this week:

http://blogs.msdn.com/b/ricom/archive/2010/09/27/less-loosely-coupled-than-meets-the-eye.aspx

I find it particularly funny since he's chief something in Visual Studio and has significant experience with managed code, but he has a similar perspective on performance even from the managed code perspective.

MSN

From: sweng-gamedev-***@lists.midnightryder.com [mailto:sweng-gamedev-***@lists.midnightryder.com] On Behalf Of Mike Acton
Sent: Wednesday, September 29, 2010 6:47 PM
To: sweng-***@midnightryder.com
Subject: Re: [Sweng-Gamedev] Data oriented programming

I'll add my two cents to what Tony said: I personally think the most important thing to keep in mind is that fundamentally the goal of any program (at any level of the hierarchy - from application to function to single instruction) is to transform data from one form into another. Understanding that data being transformed is the key to doing that well. The reality is that it's always a mix of mental models, but the real *problem* in practice is that not only is there no real understanding of the data being transformed, but it's ignored *completely* - as though "data" is some abstract concept that can be ignored.

Mike.
On Wed, Sep 29, 2010 at 5:47 PM, Tony Albrecht <***@gmail.com<mailto:***@gmail.com>> wrote:
Hi Gabriel,
A lot of the work I do is related to optimisation of code or at least producing code that has to perform well, so I use a lot of the concepts mentioned in that paper extensively (and I should too, I wrote it :) ). The key thing to consider is your data - how much of it you access and in what order and where it goes. I find that just reordering data into contiguous chunks has a positive effect, regardless of the state of the code using it. Once the data is fixed, massaging the code that uses it into a better state ( SIMD becomes easier on contig data) improves performance even more.

Taking an existing code base and retrofitting DOD concepts into it can create some pretty nasty code, but designing with data in mind right from the start can produce maintainable, readable, extensible code that runs fast.

-Tony
On 29 September 2010 19:01, Gabriel Sassone <***@gmail.com<mailto:***@gmail.com>> wrote:
Hi everyone,
just wondering if anyone of you has some experiences with "Data-oriented programming" stuff.
To push the limits of the PS3 you must use SPUs very well, and to do this I found that the "data-oriented programming" is really useful and let you create a very scalable framework.
There are some papers around (the most know is probably the one from Sony):

http://research.scee.net/files/presentations/gcapaustralia09/Pitfalls_of_Object_Oriented_Programming_GCAP_09.pdf

I've experienced in small scale (on PS3) and found it really intriguing and performance wise.
Also it is possible to use this mentality to improve performances on X360 and PC, almost all platforms!

What are your experiences with that?

Thanks!
Gabriel

Richard

2010-09-30 19:32:24 UTC

Permalink

In article <***@bngexchange01.bungie.bng.local>,
Mat Noguchi <***@bungie.com> writes:

> Incidentally, Rico Mariani posted this earlier this week:
>
> http://blogs.msdn.com/b/ricom/archive/2010/09/27/less-loosely-coupled-than-me
ets-the-eye.aspx

This is why I've been recommending that performance tests be part of
your CI build. As soon as you have two components that each use 2/3
of the L2 cache, you'll know from your performance tests.

FitNesse is a nice way, but certainly not the only way, to orchestrate
these end-to-end tests. <http://www.fitnesse.org>
--
"The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
<http://legalizeadulthood.wordpress.com/the-direct3d-graphics-pipeline/>

Legalize Adulthood! <http://legalizeadulthood.wordpress.com>

Fabian Giesen

2010-09-30 20:22:30 UTC

Permalink

On 9/30/2010 12:32 PM, Richard wrote:
>
> In article<***@bngexchange01.bungie.bng.local>,
> Mat Noguchi<***@bungie.com> writes:
>
>> Incidentally, Rico Mariani posted this earlier this week:
>>
>> http://blogs.msdn.com/b/ricom/archive/2010/09/27/less-loosely-coupled-than-me
> ets-the-eye.aspx
>
> This is why I've been recommending that performance tests be part of
> your CI build. As soon as you have two components that each use 2/3
> of the L2 cache, you'll know from your performance tests.

Your cache usage patterns (usually) depend on architectural decisions,
not implementation details. Any sort of integration testing (even if
you're doing CI) is too late a stage to detect architectural problems.
By that point you're already in trouble.

The whole point is that no development methodology will do the thinking
for you. You simply *will not* get good results if you break your system
down into little pieces and forget about the big picture. Details matter.

That's not arguing for Waterfall. Unexpected issues will bite you in
unexpected ways, and you will have to iterate your design. But if you
want high performance, you absolutely, positively have to start with a
clear picture of the data flows you're going to have. "Module X and Y
both use 2/3 of the L2 cache" is a completely useless value. How do they
use it? What are they using it for? Cache usage isn't additive. If they
both work on largely the same data, two modules that "use 2/3 of the L2
cache" taken together might "use 3/4 of the L2 cache". Are they both one
large loop going through a linear dataset, or are they randomly
traversing linked data structures? Are they writing to large output
buffers that don't get read for a while? (In which case you can use
streaming stores/evictions to kick them out of the cache as soon as
you're done). And so on.

None of that needs a detailed implementation to answer. But all of it
needs a good idea of the layout (and size) of your data structures, and
a knowledge of how they're used. "Module A magically transforms widgets
to gadgets" isn't good enough. If you don't know enough about the
behavior of your program to make reasonable back-of-the-envelope
calculations about how much memory you're touching (and where), you
don't know enough to make it run fast, period.

-Fabian

Richard

2010-09-30 20:36:50 UTC

Permalink

In article <***@gmx.net>,
Fabian Giesen <***@gmx.net> writes:

> On 9/30/2010 12:32 PM, Richard wrote:
> >
> > In article<***@bngexchange01.bungie.
bng.local>,
> > Mat Noguchi<***@bungie.com> writes:
> >
> >> Incidentally, Rico Mariani posted this earlier this week:
> >>
> >> http://blogs.msdn.com/b/ricom/archive/2010/09/27/less-loosely-coupled-than
-me
> > ets-the-eye.aspx
> >
> > This is why I've been recommending that performance tests be part of
> > your CI build. As soon as you have two components that each use 2/3
> > of the L2 cache, you'll know from your performance tests.
>
> Your cache usage patterns (usually) depend on architectural decisions,
> not implementation details. Any sort of integration testing (even if
> you're doing CI) is too late a stage to detect architectural problems.
> By that point you're already in trouble.

Not that I disagree, but if you're in trouble isn't it best to know as
soon as possible? That's all I'm asserting.
--
"The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
<http://legalizeadulthood.wordpress.com/the-direct3d-graphics-pipeline/>

Legalize Adulthood! <http://legalizeadulthood.wordpress.com>

Fabian Giesen

2010-09-30 20:58:44 UTC

Permalink

On 9/30/2010 1:36 PM, Richard wrote:
>> Your cache usage patterns (usually) depend on architectural decisions,
>> not implementation details. Any sort of integration testing (even if
>> you're doing CI) is too late a stage to detect architectural problems.
>> By that point you're already in trouble.
>
> Not that I disagree, but if you're in trouble isn't it best to know as
> soon as possible? That's all I'm asserting.

Yes, it is best to know as soon as possible, and integration testing is
most emphatically not "as soon as possible" for architectural decisions.
Back-of-the-envelope calculations during the design stage are. Consider
them unit tests for your design if you're hell-bent on viewing
everything as an agile progress.

-Fabian

Richard

2010-09-30 21:06:48 UTC

Permalink

In article <***@gmx.net>,
Fabian Giesen <***@gmx.net> writes:

> > Not that I disagree, but if you're in trouble isn't it best to know as
> > soon as possible? That's all I'm asserting.
>
> Yes, it is best to know as soon as possible, and integration testing is
> most emphatically not "as soon as possible" for architectural decisions.
> Back-of-the-envelope calculations during the design stage are.

... but of course those calculations are done by error-prone humans
and can be done wrong.

And just because you decide on an architecture in the beginning
doesn't mean that other people on your team are adding things
consistent with that architecture.

There's nothing wrong with planning things up front.

CI tests ensure that reality is consistent with that plan.

Designs are propositions and postulates, not measured data.
--
"The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
<http://legalizeadulthood.wordpress.com/the-direct3d-graphics-pipeline/>

Legalize Adulthood! <http://legalizeadulthood.wordpress.com>

Mat Noguchi

2010-09-30 21:13:31 UTC

Permalink

I think a great way to synthesize agile vs. planning is that if you are going to plan, you should also plan on validating that plan. And if you are going to be agile, realize that the fundamental property of agile is a learning process. Which means experimentation and reflection, not just experimentation.

MSN
-----Original Message-----
From: sweng-gamedev-***@lists.midnightryder.com [mailto:sweng-gamedev-***@lists.midnightryder.com] On Behalf Of Richard
Sent: Thursday, September 30, 2010 2:07 PM
To: sweng-***@midnightryder.com
Subject: Re: [Sweng-Gamedev] Data oriented programming

In article <***@gmx.net>,
Fabian Giesen <***@gmx.net> writes:

> > Not that I disagree, but if you're in trouble isn't it best to know as
> > soon as possible? That's all I'm asserting.
>
> Yes, it is best to know as soon as possible, and integration testing is
> most emphatically not "as soon as possible" for architectural decisions.
> Back-of-the-envelope calculations during the design stage are.

... but of course those calculations are done by error-prone humans
and can be done wrong.

And just because you decide on an architecture in the beginning
doesn't mean that other people on your team are adding things
consistent with that architecture.

There's nothing wrong with planning things up front.

CI tests ensure that reality is consistent with that plan.

Designs are propositions and postulates, not measured data.
--
"The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
<http://legalizeadulthood.wordpress.com/the-direct3d-graphics-pipeline/>

Legalize Adulthood! <http://legalizeadulthood.wordpress.com>

Jon Watte

2010-10-05 22:03:50 UTC

Permalink

There are times when back-of-the-envelope calculations turn out to be wrong.
Or, perhaps more commonly, when implementation bugs cause systems to not
behave the way you planned them.
CI builds catching performance problems as soon as they are checked into the
code base seems quite preferrable to customers and reviewers catching bugs
when you ship :-)
I don't see why you think adding performance tests to a CI tester is wrong?
We have it, and have used it, and it's caught things for a long time. Even
things that we planned well, but ended up having either unexpected side
effects, or implementation problems.

Sincerely,

jw

--
Americans might object: there is no way we would sacrifice our living
standards for the benefit of people in the rest of the world. Nevertheless,
whether we get there willingly or not, we shall soon have lower consumption
rates, because our present rates are unsustainable.

On Thu, Sep 30, 2010 at 1:58 PM, Fabian Giesen <***@gmx.net> wrote:

> On 9/30/2010 1:36 PM, Richard wrote:
>
>> Your cache usage patterns (usually) depend on architectural decisions,
>>> not implementation details. Any sort of integration testing (even if
>>> you're doing CI) is too late a stage to detect architectural problems.
>>> By that point you're already in trouble.
>>>
>>
>> Not that I disagree, but if you're in trouble isn't it best to know as
>> soon as possible? That's all I'm asserting.
>>
>
> Yes, it is best to know as soon as possible, and integration testing is
> most emphatically not "as soon as possible" for architectural decisions.
> Back-of-the-envelope calculations during the design stage are. Consider them
> unit tests for your design if you're hell-bent on viewing everything as an
> agile progress.
>
> -Fabian
>
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>

Fabian Giesen

2010-10-05 22:33:59 UTC

Permalink

On 10/5/2010 3:03 PM, Jon Watte wrote:
> There are times when back-of-the-envelope calculations turn out to be wrong.
> Or, perhaps more commonly, when implementation bugs cause systems to not
> behave the way you planned them.
> CI builds catching performance problems as soon as they are checked into
> the code base seems quite preferrable to customers and reviewers
> catching bugs when you ship :-)
> I don't see why you think adding performance tests to a CI tester is
> wrong? We have it, and have used it, and it's caught things for a long
> time. Even things that we planned well, but ended up having either
> unexpected side effects, or implementation problems.

I didn't mean to suggest that you shouldn't test performance during
testing, just that it shouldn't be your "first line of defense".
Obviously you can't fix problems before you notice them, but a lot of
these problems are entirely avoidable if you think things through early
enough.

-Fabian

Gregory Junker

2010-10-06 03:48:23 UTC

Permalink

>
> I didn't mean to suggest that you shouldn't test performance during
> testing, just that it shouldn't be your "first line of defense".
> Obviously you can't fix problems before you notice them, but a lot of
> these problems are entirely avoidable if you think things through early
> enough.

Which can easily lead to "analysis paralysis"...

Greg

Tony Albrecht

2010-10-06 04:12:09 UTC

Permalink

So you are recommending that you shouldn't think about the performance
impact of your code before you write it?

I think you should. If you are working on code which is likely to have a
performance impact on a system, then you should think carefully about the
layout of data and code. Smart decisions at the start will save you a lot of
time in the long run. I've worked on systems that are incredibly difficult
to optimise due to uniformly slow code where a week spent fiddling around
during the design of the system would have saved months down the track.

-Tony

On 6 October 2010 14:18, Gregory Junker <***@dayark.com> wrote:

> >
> > I didn't mean to suggest that you shouldn't test performance during
> > testing, just that it shouldn't be your "first line of defense".
> > Obviously you can't fix problems before you notice them, but a lot of
> > these problems are entirely avoidable if you think things through early
> > enough.
>
> Which can easily lead to "analysis paralysis"...
>
> Greg
>
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>

Richard Fabian

2010-10-06 09:52:42 UTC

Permalink

I'm with Tony on this one: Too many times "just write it and we'll optimise
it later" has come to bite.

So many times it's been "just add a system to do this", and then three years
later, we're not using it the way it was originally intended due to
performance issues. For example, scripting languages that get too slow, get
compiled to code, then get optimised further by doing trickery, then some
random guy walks up and says, "so why not do this stuff in code anyway" and
gets bashed for pointing out the painful but obvious.

Don't trap yourself in thinking that something that seems easy to use, can
ever be made fast enough to USE. If you wait until your performance slot
(say one year before release), then you're in for an almighty headache as
you have to do awful things like change the way in which you even approach
certain problems. Take some visibility algorithms for example, how many
different ones are there? Portals, PVS, octree, BSP, blah blah blah.....
don't think they're all there because someone thought "ooh, i'lll try doing
it this way." No, they did it because in different circumstances, different
algorithms do better than others, either due to layout of your levels, or
layout of the code using it, or simply the access patterns into the
structure. For anyone who's actually done a complete conversion of visibilty
system from one to another, changing data structures and access techniques
throughout a codebase: I feel your pain.

There are many different ways to lay out your data, some are performant,
some are easy to manipulate... Not many are both.

Apply natural selection: If you don't care about performance from day 1,
then manipulability wins over performance. Thus ends the tale.

On 6 October 2010 05:12, Tony Albrecht <***@gmail.com> wrote:

> So you are recommending that you shouldn't think about the performance
> impact of your code before you write it?
>
> I think you should. If you are working on code which is likely to have a
> performance impact on a system, then you should think carefully about the
> layout of data and code. Smart decisions at the start will save you a lot of
> time in the long run. I've worked on systems that are incredibly difficult
> to optimise due to uniformly slow code where a week spent fiddling around
> during the design of the system would have saved months down the track.
>
> -Tony
>
>
> On 6 October 2010 14:18, Gregory Junker <***@dayark.com> wrote:
>
>> >
>> > I didn't mean to suggest that you shouldn't test performance during
>> > testing, just that it shouldn't be your "first line of defense".
>> > Obviously you can't fix problems before you notice them, but a lot of
>> > these problems are entirely avoidable if you think things through early
>> > enough.
>>
>> Which can easily lead to "analysis paralysis"...
>>
>> Greg
>>
>> _______________________________________________
>> Sweng-Gamedev mailing list
>> Sweng-***@lists.midnightryder.com
>>
>> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>>
>
>
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>
>

--
fabs();
Just because the world is full of people that think just like you, doesn't
mean the other ones can't be right.

Mat Noguchi

2010-10-06 16:53:20 UTC

Permalink

I think you overestimate anyone's ability to predict the actual difficulties of shipping and underestimate the ability of programmers to bash a system into submission. Unless you are talking about the mean and not possibilities.

MSN
From: sweng-gamedev-***@lists.midnightryder.com [mailto:sweng-gamedev-***@lists.midnightryder.com] On Behalf Of Richard Fabian
Sent: Wednesday, October 06, 2010 2:53 AM
To: sweng-***@midnightryder.com
Subject: Re: [Sweng-Gamedev] Data oriented programming

I'm with Tony on this one: Too many times "just write it and we'll optimise it later" has come to bite.

So many times it's been "just add a system to do this", and then three years later, we're not using it the way it was originally intended due to performance issues. For example, scripting languages that get too slow, get compiled to code, then get optimised further by doing trickery, then some random guy walks up and says, "so why not do this stuff in code anyway" and gets bashed for pointing out the painful but obvious.

Don't trap yourself in thinking that something that seems easy to use, can ever be made fast enough to USE. If you wait until your performance slot (say one year before release), then you're in for an almighty headache as you have to do awful things like change the way in which you even approach certain problems. Take some visibility algorithms for example, how many different ones are there? Portals, PVS, octree, BSP, blah blah blah..... don't think they're all there because someone thought "ooh, i'lll try doing it this way." No, they did it because in different circumstances, different algorithms do better than others, either due to layout of your levels, or layout of the code using it, or simply the access patterns into the structure. For anyone who's actually done a complete conversion of visibilty system from one to another, changing data structures and access techniques throughout a codebase: I feel your pain.

There are many different ways to lay out your data, some are performant, some are easy to manipulate... Not many are both.

Apply natural selection: If you don't care about performance from day 1, then manipulability wins over performance. Thus ends the tale.

On 6 October 2010 05:12, Tony Albrecht <***@gmail.com<mailto:***@gmail.com>> wrote:
So you are recommending that you shouldn't think about the performance impact of your code before you write it?

I think you should. If you are working on code which is likely to have a performance impact on a system, then you should think carefully about the layout of data and code. Smart decisions at the start will save you a lot of time in the long run. I've worked on systems that are incredibly difficult to optimise due to uniformly slow code where a week spent fiddling around during the design of the system would have saved months down the track.

-Tony

On 6 October 2010 14:18, Gregory Junker <***@dayark.com<mailto:***@dayark.com>> wrote:
>
> I didn't mean to suggest that you shouldn't test performance during
> testing, just that it shouldn't be your "first line of defense".
> Obviously you can't fix problems before you notice them, but a lot of
> these problems are entirely avoidable if you think things through early
> enough.
Which can easily lead to "analysis paralysis"...

Greg

_______________________________________________
Sweng-Gamedev mailing list
Sweng-***@lists.midnightryder.com<mailto:Sweng-***@lists.midnightryder.com>
http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com

_______________________________________________
Sweng-Gamedev mailing list
Sweng-***@lists.midnightryder.com<mailto:Sweng-***@lists.midnightryder.com>
http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com

--
fabs();
Just because the world is full of people that think just like you, doesn't mean the other ones can't be right.

Fabian Giesen

2010-10-06 18:53:03 UTC

Permalink

On 06.10.2010 09:53, Mat Noguchi wrote:
> I think you overestimate anyone’s ability to predict the actual
> difficulties of shipping and underestimate the ability of programmers to
> bash a system into submission. Unless you are talking about the mean and
> not possibilities.

Why do so many people on this list insist on turning this into a
dichotomy? It's not. I completely agree that no amount of planning ahead
will prevent you from having to change important parts of your codebase
late in the project because your assumptions turned out to be wrong. I
also agree trying to get everything perfect the first time round is a
fool's errand. (And I doubt anyone with software development experience
disagrees).

But I'm somewhat baffled that my suggestion of, in effect, "think before
you type" is actually being met with resistance. Just because I can't
fully predict the performance implications of a system in advance
doesn't mean it's a waste of my time to think about it at all. And, in
my experience anyway, *not* thinking about it in advance is a waste of
mine and other people's time some months down the road.

Same thing with the testing. Again, it's not a dichotomy. Just because I
think that testing is too late in the process to easily fix performance
problems doesn't mean that performance testing is worthless, it just
makes it a bad choice for your first line of defense.

-Fabian

Mat Noguchi

2010-10-06 19:02:23 UTC

Permalink

I was responding specifically to this:

> Apply natural selection: If you don't care about performance from day 1,
> then manipulability wins over performance. Thus ends the tale.

Which was stated by Richard Fabian, not you.

MSN
-----Original Message-----
From: sweng-gamedev-***@lists.midnightryder.com [mailto:sweng-gamedev-***@lists.midnightryder.com] On Behalf Of Fabian Giesen
Sent: Wednesday, October 06, 2010 11:53 AM
To: sweng-***@lists.midnightryder.com
Subject: Re: [Sweng-Gamedev] Data oriented programming

On 06.10.2010 09:53, Mat Noguchi wrote:
> I think you overestimate anyone's ability to predict the actual
> difficulties of shipping and underestimate the ability of programmers to
> bash a system into submission. Unless you are talking about the mean and
> not possibilities.

Why do so many people on this list insist on turning this into a
dichotomy? It's not. I completely agree that no amount of planning ahead
will prevent you from having to change important parts of your codebase
late in the project because your assumptions turned out to be wrong. I
also agree trying to get everything perfect the first time round is a
fool's errand. (And I doubt anyone with software development experience
disagrees).

But I'm somewhat baffled that my suggestion of, in effect, "think before
you type" is actually being met with resistance. Just because I can't
fully predict the performance implications of a system in advance
doesn't mean it's a waste of my time to think about it at all. And, in
my experience anyway, *not* thinking about it in advance is a waste of
mine and other people's time some months down the road.

Same thing with the testing. Again, it's not a dichotomy. Just because I
think that testing is too late in the process to easily fix performance
problems doesn't mean that performance testing is worthless, it just
makes it a bad choice for your first line of defense.

-Fabian

Richard

2010-10-06 22:40:51 UTC

Permalink

In article <***@gmx.net>,
Fabian Giesen <***@gmx.net> writes:

> Why do so many people on this list insist on turning this into a
> dichotomy? [...]

Ironically when I read your initial response to my suggestion of
having CI performance tests, this was my first thought.

Why does it have to be either/or, why can't it be both?

Why can't we think ahead but validate our thinking with CI testing?

> But I'm somewhat baffled that my suggestion of, in effect, "think before
> you type" is actually being met with resistance.

I don't perceive anyone as actually having expressed resistance to the
idea of thinking ahead.

I don't know if you feel this way or not, but I commonly see people
expressing the idea that agile development is incompatible with
thinking ahead or that "true" agile development somehow prohibits
thinking ahead. Nothing could be further from the truth. Maybe they
get this idea from the agile manifesto's statement that it values
"responding to change over following a plan". It doesn't say "do not
plan"; the agile manifesto was created in response to an environment
where people were drowning in specifications (i.e. plans) and yet
still building the wrong thing based on customer feedback.

In a gaming studio, "responding to change over following a plan" would
mean to me things like adjusting game play based on test play groups,
even if your original plan/architecture/whatever had something
different and it means ditching parts or all of your original
plan/architecture/whatever.

Maybe a game like "The Daedalus Encounter" probably wouldn't have been
such a piece of shit if they had responded to change instead of following
their plan.
--
"The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
<http://legalizeadulthood.wordpress.com/the-direct3d-graphics-pipeline/>

Legalize Adulthood! <http://legalizeadulthood.wordpress.com>

Richard

2010-10-06 22:41:15 UTC

Permalink

Richard

2010-10-06 22:42:26 UTC

Permalink

My apologies for the duplicate send.
--
"The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
<http://legalizeadulthood.wordpress.com/the-direct3d-graphics-pipeline/>

Legalize Adulthood! <http://legalizeadulthood.wordpress.com>

Richard Fabian

2010-10-06 20:44:42 UTC

Permalink

I'm not sure I actually understand what you've written there (especially the
bit about "mean and not possibilities",) but I think my point is orthogonal
to shipping issues. Normally, a game has either good performance or bad, and
that doesn't really affect whether or not it ships. Normally it only affects
what it is shipped with.

Bashing a system into submission is something I have done a number of times,
but sometimes we've had to go after smaller fish to bash as the real large
fish is just too hard to make good. If you don't consider the difficulty of
future optimisations, then you run the risk of having to redo part of the
project.

To me, it sounds like you're happy ripping systems out to replace them, and
that would make sense if you're working on the same kind of code base game
after game, things change, but everything kinda stays the same. That's
great, you can throw away whole ways of working and introduce new stuff and
see how it works, but when you're doing multiple projects in the same studio
without much overlapping tech or at least without much overlapping
requirements, your estimated schedules are wild and inaccurate because each
project is like you've never done a project like it before. Putting
performance estimations off and relying on late in the game optimisation
will get you a game, but not one that is great. Something will suffer.

One last point, performance is one of those things that fuels it's own fire.
Like asset build optimisations or iteration time reduction. Every time you
make one of these things better, it makes the whole studio run faster.

On 6 October 2010 17:53, Mat Noguchi <***@bungie.com> wrote:

> I think you overestimate anyones ability to predict the actual
> difficulties of shipping and underestimate the ability of programmers to
> bash a system into submission. Unless you are talking about the mean and not
> possibilities.
>
>
>
> MSN
>
> *
> *
>

--
fabs();
Just because the world is full of people that think just like you, doesn't
mean the other ones can't be right.

Mat Noguchi

2010-10-06 22:33:33 UTC

Permalink

I think my statement was terse and subtle enough to allow for pretty much any interpretation (apparently mostly negative). So let's expand:

I interpreted your post as "you need to design your system with <x> in mind up front." If that's not the case then feel free to ignore my initial reply.

Although based on your reply, I don't think my retort is even necessary, other than to say it's not that easy to understand the performance impact of what you do; it's much more obvious to understand performance from the perspective of what you shouldn't do. It's a very subtle distinction, and doesn't really clash with anything you said.

[And I would also argue that simply "planning early on for performance" isn't really sufficient unless the "plan" is: don't ever write code that will make performance crappy.]

MSN
From: sweng-gamedev-***@lists.midnightryder.com [mailto:sweng-gamedev-***@lists.midnightryder.com] On Behalf Of Richard Fabian
Sent: Wednesday, October 06, 2010 1:45 PM
To: sweng-***@midnightryder.com
Subject: Re: [Sweng-Gamedev] Data oriented programming

I'm not sure I actually understand what you've written there (especially the bit about "mean and not possibilities",) but I think my point is orthogonal to shipping issues. Normally, a game has either good performance or bad, and that doesn't really affect whether or not it ships. Normally it only affects what it is shipped with.

Bashing a system into submission is something I have done a number of times, but sometimes we've had to go after smaller fish to bash as the real large fish is just too hard to make good. If you don't consider the difficulty of future optimisations, then you run the risk of having to redo part of the project.

To me, it sounds like you're happy ripping systems out to replace them, and that would make sense if you're working on the same kind of code base game after game, things change, but everything kinda stays the same. That's great, you can throw away whole ways of working and introduce new stuff and see how it works, but when you're doing multiple projects in the same studio without much overlapping tech or at least without much overlapping requirements, your estimated schedules are wild and inaccurate because each project is like you've never done a project like it before. Putting performance estimations off and relying on late in the game optimisation will get you a game, but not one that is great. Something will suffer.

One last point, performance is one of those things that fuels it's own fire. Like asset build optimisations or iteration time reduction. Every time you make one of these things better, it makes the whole studio run faster.

On 6 October 2010 17:53, Mat Noguchi <***@bungie.com<mailto:***@bungie.com>> wrote:
I think you overestimate anyone's ability to predict the actual difficulties of shipping and underestimate the ability of programmers to bash a system into submission. Unless you are talking about the mean and not possibilities.

MSN

--
fabs();
Just because the world is full of people that think just like you, doesn't mean the other ones can't be right.

Jon Watte

2010-10-10 20:39:03 UTC

Permalink

There are a few differing points of view here.
Clearly, you can't super-optimize all code you write up front, because you
don't know that that's going to be the part that matters.

Clearly, *some* code will almost always matter -- the core of a CPU skinning
loop, say, or broad-phase collision detection. Will the performance of a
script interpreter matter? If 95% of the heavy lifting is done in the
rendering and simulation subsystems anyway, then probably not. (Not to
mention that you *should* be using Lua or Python or Mono or some other
already-debugged scripting facility :-)

You can't always know which will end up mattering, and you can't assume that
everything will matter, or you'll end up wasting a lot of expensive
programmer time. Thus, you take a good stab at guessing where it's worth it,
and move on. A CI performance tester (which I think is what this thread is
about?) is a good safety net for when you get it wrong.

Then there's the question of "bashing it into shape" as a phase of shipping.
If what you're shipping is a game, on physical (or near-physical) media,
then spending a few months to make it great, once you know that the design
of the game is solid, makes a lot of sense.
If what you're shipping is a platform or a service, you don't have that
opportunity.
I'm working at a place where we synthesize a very dynamic web site, a
semi-real-time game server back-end, and a end-user-installable 3D client
application. We ship 50 times a day. This allows us to react to changes in
the market and tune for customers with incredible speed! If we were to stop
and not ship anything new for six months, we'd probably be dead.
Thus, we take a guess at performance up front, and then let CI performance
tests, as well as measurements collected from actual customer installs, tell
us what areas we need to optimize.

Sincerely,

jw

--
Americans might object: there is no way we would sacrifice our living
standards for the benefit of people in the rest of the world. Nevertheless,
whether we get there willingly or not, we shall soon have lower consumption
rates, because our present rates are unsustainable.

On Wed, Oct 6, 2010 at 2:52 AM, Richard Fabian <***@gmail.com> wrote:

> I'm with Tony on this one: Too many times "just write it and we'll optimise
> it later" has come to bite.
>
> So many times it's been "just add a system to do this", and then three
> years later, we're not using it the way it was originally intended due to
> performance issues. For example, scripting languages that get too slow, get
> compiled to code, then get optimised further by doing trickery, then some
> random guy walks up and says, "so why not do this stuff in code anyway" and
> gets bashed for pointing out the painful but obvious.
>
> Don't trap yourself in thinking that something that seems easy to use, can
> ever be made fast enough to USE. If you wait until your performance slot
> (say one year before release), then you're in for an almighty headache as
> you have to do awful things like change the way in which you even approach
> certain problems. Take some visibility algorithms for example, how many
> different ones are there? Portals, PVS, octree, BSP, blah blah blah.....
> don't think they're all there because someone thought "ooh, i'lll try doing
> it this way." No, they did it because in different circumstances, different
> algorithms do better than others, either due to layout of your levels, or
> layout of the code using it, or simply the access patterns into the
> structure. For anyone who's actually done a complete conversion of visibilty
> system from one to another, changing data structures and access techniques
> throughout a codebase: I feel your pain.
>
> There are many different ways to lay out your data, some are performant,
> some are easy to manipulate... Not many are both.
>
> Apply natural selection: If you don't care about performance from day 1,
> then manipulability wins over performance. Thus ends the tale.
>
>
> On 6 October 2010 05:12, Tony Albrecht <***@gmail.com> wrote:
>
>> So you are recommending that you shouldn't think about the performance
>> impact of your code before you write it?
>>
>> I think you should. If you are working on code which is likely to have a
>> performance impact on a system, then you should think carefully about the
>> layout of data and code. Smart decisions at the start will save you a lot of
>> time in the long run. I've worked on systems that are incredibly difficult
>> to optimise due to uniformly slow code where a week spent fiddling around
>> during the design of the system would have saved months down the track.
>>
>> -Tony
>>
>>
>> On 6 October 2010 14:18, Gregory Junker <***@dayark.com> wrote:
>>
>>> >
>>> > I didn't mean to suggest that you shouldn't test performance during
>>> > testing, just that it shouldn't be your "first line of defense".
>>> > Obviously you can't fix problems before you notice them, but a lot of
>>> > these problems are entirely avoidable if you think things through early
>>> > enough.
>>>
>>> Which can easily lead to "analysis paralysis"...
>>>
>>> Greg
>>>
>>> _______________________________________________
>>> Sweng-Gamedev mailing list
>>> Sweng-***@lists.midnightryder.com
>>>
>>> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>>>
>>
>>
>> _______________________________________________
>> Sweng-Gamedev mailing list
>> Sweng-***@lists.midnightryder.com
>>
>> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>>
>>
>
>
> --
> fabs();
> Just because the world is full of people that think just like you, doesn't
> mean the other ones can't be right.
>
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>
>

Veikko Eeva

2010-10-11 18:40:49 UTC

Permalink

Jon Watte wrote:
> There are a few differing points of view here.
> Clearly, you can't super-optimize all code you write up front, because
> you don't know that that's going to be the part that matters.
>[...]

This isn't about games, but I couldn't resist on providing a link to a
Velocity 2010 conference presentation notes by Theo Schlossnagle of
Scalable Internet Architectures fame about, well, Scalable
Internet Architectures:
http://assets.en.oreilly.com/1/event/44/Scalable%20Internet%20Architectures%20Presentation%202.pdf

After the intial hand-waving, the presentation is much about
understanding the problem in the context of performance and scalability
trade-offs.

Cheers,
Veikko E.

P.s.
Velocity conference 2010 Slides/Notes
http://www.royans.net/arch/all-velocity-conference-2010-slidesnotes/

Continuous deployments may not be for everyone: Culture
http://www.royans.net/arch/continuous-deployments-may-not-be-for-everyone-culture/

OvermindDL1

2010-10-16 02:11:03 UTC

Permalink

On Mon, Oct 11, 2010 at 12:40 PM, Veikko Eeva <***@iki.fi> wrote:
>
> Jon Watte wrote:
>>
>> There are a few differing points of view here.
>> Clearly, you can't super-optimize all code you write up front, because
>> you don't know that that's going to be the part that matters.
>> [...]
>
> This isn't about games, but I couldn't resist on providing a link to a
> Velocity 2010 conference presentation notes by Theo Schlossnagle of Scalable
> Internet Architectures fame about, well, Scalable
> Internet Architectures:
> http://assets.en.oreilly.com/1/event/44/Scalable%20Internet%20Architectures%20Presentation%202.pdf
>
> After the intial hand-waving, the presentation is much about understanding
> the problem in the context of performance and scalability trade-offs.
>
>
> Cheers,
> Veikko E.
>
>
> P.s.
> Velocity conference 2010 Slides/Notes
> http://www.royans.net/arch/all-velocity-conference-2010-slidesnotes/
>
> Continuous deployments may not be for everyone: Culture
> http://www.royans.net/arch/continuous-deployments-may-not-be-for-everyone-culture/

Continuing this line of questioning, this is actually my use-case.

I have a multi-dimensional array, say typedef myType[SIZE][SIZE][SIZE]
myTypeArr;, the SIZE is set right now but I can change it to whatever
allows best access. Right now to fit in the L2 cache properly on most
CPU's I have it SIZE set to 8, which as I recall lets it be about a
24k block.

myType itself, in essence, is a discriminated union, you can think of
it as a tag and a data, like this:
struct myType {
size_unsigned short tag;
void *data;
}
(In reality is reserves enough internal space to hold the data of the
'tag' for most types, others have a void* at the end of the allocated
area for additional data, where the most accessed is in the main area,
not through the pointer, I think this was 31bytes in size, maybe 63,
would have to look, but that can be changed as well to a completely
different way if necessary.)

The tag either references a built-in type, or it 0 it references a
scripted type (which I usually later plan to make into internal types
for speed reason, but the scripted type will remain for third-party
extensions). There are two 'major' passes which are called the most
often, each of which access two different types of data inside the
data element, so I could easily create a function to work over all of
these in quick turn. However, there is a lot more sporadic access,
for example, say tag type 5 needs to have a certain function run over
it when a certain event happens, or more simply, every one whole
second. It is not just tag type 5, but a lot of tag types that will
need to perform actions every once in a while, and usually on *all* of
a certain tag type within bounds in the overall structure. I figured
that the brute-force method would be to have a pointer for each
possible tag type, and as a tag is allocated just have the main
structure keep the 'coords' of it with a pointer to that index of the
tag type in another allocated structure, something like this:
struct myTypeArr {
myType arr[SIZE][SIZE][SIZE]; // where mytype is a tag and an index into:

void **internalTags[TAGCOUNT];
}
and access it like:
myTypeArr a;
a.internalTags[a.arr[3][6][1].tag][a.arr[3][6][1].idx].data // or
.location or whatever
Or something nasty like that. problem is, that only uses memory for
an item when it exists and is marked zero otherwise, but this will
still eat a *LOT* more memory then otherwise due to the multiple void*
indirections, and of course each pointer is going to be using 4/8
bytes of memory each, although since the tag size of 'mostly' known
(very few have variable sizes), that could simplify memory allocation
somewhat.

The overall structure as-is (although it can be pretty easily changed,
the back-end is pretty well abstracted out from the front) is an
octree, the octree can currently handle a max size of 32bits x 32bits
x 32bits, although reduced a little based on the above SIZE constant
(if size is 8 then the 32 bits is reduced to 29 bits or so).
Basically, I need to index into a *massive*, not always loaded array
that is 32bits x 32bits x 32bits in size *just* for indexing, not
including data, and the data per cell may be variable size, but in
general nearby chunks tend to be the same. So it is abstracted out as
an octree, if all blocks in a level of the octree are all the same
then the lower levels are deleted from memory and it is given as just
a single block representing all blocks in the upper level, so on and
so forth.

The way the data is generated comes from a generator function. Upon
first access of a data block, its 'chunk' is loaded, and the chunk is
loaded by giving the location and dimension of the chunk to a
generator function that fills it up with data (dynamically generated,
scripted, who knows, the function is a black-box). When the data
chunk is no longer needed for a time period then it is unloaded from
memory by being serialized out to disk (whether by just writing a file
with a special filename based on location and block size, or being
written to a DB, or, how it is done now, being written to a
distributed filesystem using Sector/Sphere, all subject to change
based on if a better method is found). Some chunks may almost always
be loaded into memory due to rapid access on them, potentially growing
large to to a large number of chunks being loaded. Currently I am
writing it into Sector/Sphere so I can distribute it, but thus far I
have only distributed the file reading/writing of chunks and not of
processing.

I am guessing that I could distribute the processing and let each
Sector/Sphere slave work on entire chunks at a time and just do a
giant switch on the tag and work on each block inside the chunks, a
couple problems with that though, the primary two being that blocks
can affect other nearby blocks (including nearby ones in nearby
chunks), could probably work around this by either overlapping chunks
and having 'out of range' chunks be read-only, but still affect the
nearby internal ones to the chunk, that would duplicate processing
though, and I would have to make sure that data access was only
affected by chunks of a very limited range while also ensuring that
they only propagate outwards and not inwards, or have all affects
propogate outwards in another pass to handle such things, thus slowing
it down by causing another access pass, although that could be
mitigated by just running it concurrently with the next 'tick' pass
inside it, but would could potentially double memory usage (which I
guess is easier to access then faster CPU usage...).

So, I want to minimize latency in the tick processing, including
minimizing reading from the drive, should I try to figure out a way to
maximize concurrent processing on this 6-core server, or perhaps
should I go ahead and scale out to Sector/Sphere to let slave
computers do a lot of the processing.

Hmm, an idea, I could do all of the processing of the nearby nodes to
the players on the 6-core server itself, and do the further nodes on
the slaves, I do not need it to be deterministic (although it would be
*VERY* nice if it were)...

I could just use Sector/Sphere for storage and create multiple client
server to work on the data in tandom, say in specific in-game
geographical areas, and have clients connect to the server that
handles the area closest to them, perhaps be connected the the
multiple nearest servers at the same time for quick access...

This might be the best bet as it will not be handling just 'one' world
either, hmm...

I still need to figure out how to optimize the primary server work
though, is the best method just to work on a block by block basis
while switching based on the tag, or should that try to be optimized
somehow to minimize branches in the functions? Perhaps just a master
array list and go over those? But I do not want to update every block
in every chunk in every tick. For example, most things will only need
to update near every tick in only the chunks that are near players,
further away ones can update slower if necessary. Optimally
everything would update every tick, but that will no doubt be
infeasible when it gets to a larger dataset. Thoughts? It seems like
a problem that should be very simple to use such a data-oriented
pattern, except that there are going to be a rather large amount of
tag types (a short should handle it well, but it is not altogether
impossible for me to need to increase it to a uint32 later), each of
which may operate differently.

Hmm, another idea, most blocks will require no internal state if I
split their small amount of stated into new block types, but that will
quickly balloon the block type count, and there will still be a
comparatively small amount of blocks that do require state, very few
with rather large state.

Jon Watte

2010-10-17 15:36:09 UTC

Permalink

tl;dr:

Either, you walk the data in memory order, and switch on type, or you group
the data in type order, and do all types separately.

In the first case, you get branch mis-predictions, which can be expensive,
and you also get higher I-cache pressure and even D-cache pressure because
you have to keep a bunch of different "subsystems" (type handlers?) in
memory.

In the second case, you have to map xyz->object (and object->xyz) in some
way other than the three-dimensional array, such as a sparse array
implemented as a hash table. That will cause more cache misses when locating
the item for any particular element, but it will let you walk elements of
the same type using linear memory access, and have less I and D cache
pressure and better branch prediction.

Which one is faster depends on whether you do mostly "update all of type X"
or do mostly "find element by xyz" operations per frame. You'll simply have
to come up with realistic workloads and profile it both ways.

Sincerely,

jw

--
Americans might object: there is no way we would sacrifice our living
standards for the benefit of people in the rest of the world. Nevertheless,
whether we get there willingly or not, we shall soon have lower consumption
rates, because our present rates are unsustainable.

On Fri, Oct 15, 2010 at 7:11 PM, OvermindDL1 <***@gmail.com> wrote:

> On Mon, Oct 11, 2010 at 12:40 PM, Veikko Eeva <***@iki.fi> wrote:
> >
> > Jon Watte wrote:
> >>
> >> There are a few differing points of view here.
> >> Clearly, you can't super-optimize all code you write up front, because
> >> you don't know that that's going to be the part that matters.
> >> [...]
> >
> > This isn't about games, but I couldn't resist on providing a link to a
> > Velocity 2010 conference presentation notes by Theo Schlossnagle of
> Scalable
> > Internet Architectures fame about, well, Scalable
> > Internet Architectures:
> >
> http://assets.en.oreilly.com/1/event/44/Scalable%20Internet%20Architectures%20Presentation%202.pdf
> >
> > After the intial hand-waving, the presentation is much about
> understanding
> > the problem in the context of performance and scalability trade-offs.
> >
> >
> > Cheers,
> > Veikko E.
> >
> >
> > P.s.
> > Velocity conference 2010 Slides/Notes
> > http://www.royans.net/arch/all-velocity-conference-2010-slidesnotes/
> >
> > Continuous deployments may not be for everyone: Culture
> >
> http://www.royans.net/arch/continuous-deployments-may-not-be-for-everyone-culture/
>
>
> Continuing this line of questioning, this is actually my use-case.
>
> I have a multi-dimensional array, say typedef myType[SIZE][SIZE][SIZE]
> myTypeArr;, the SIZE is set right now but I can change it to whatever
> allows best access. Right now to fit in the L2 cache properly on most
> CPU's I have it SIZE set to 8, which as I recall lets it be about a
> 24k block.
>
> myType itself, in essence, is a discriminated union, you can think of
> it as a tag and a data, like this:
> struct myType {
> size_unsigned short tag;
> void *data;
> }
> (In reality is reserves enough internal space to hold the data of the
> 'tag' for most types, others have a void* at the end of the allocated
> area for additional data, where the most accessed is in the main area,
> not through the pointer, I think this was 31bytes in size, maybe 63,
> would have to look, but that can be changed as well to a completely
> different way if necessary.)
>
> The tag either references a built-in type, or it 0 it references a
> scripted type (which I usually later plan to make into internal types
> for speed reason, but the scripted type will remain for third-party
> extensions). There are two 'major' passes which are called the most
> often, each of which access two different types of data inside the
> data element, so I could easily create a function to work over all of
> these in quick turn. However, there is a lot more sporadic access,
> for example, say tag type 5 needs to have a certain function run over
> it when a certain event happens, or more simply, every one whole
> second. It is not just tag type 5, but a lot of tag types that will
> need to perform actions every once in a while, and usually on *all* of
> a certain tag type within bounds in the overall structure. I figured
> that the brute-force method would be to have a pointer for each
> possible tag type, and as a tag is allocated just have the main
> structure keep the 'coords' of it with a pointer to that index of the
> tag type in another allocated structure, something like this:
> struct myTypeArr {
> myType arr[SIZE][SIZE][SIZE]; // where mytype is a tag and an index into:
>
> void **internalTags[TAGCOUNT];
> }
> and access it like:
> myTypeArr a;
> a.internalTags[a.arr[3][6][1].tag][a.arr[3][6][1].idx].data // or
> .location or whatever
> Or something nasty like that. problem is, that only uses memory for
> an item when it exists and is marked zero otherwise, but this will
> still eat a *LOT* more memory then otherwise due to the multiple void*
> indirections, and of course each pointer is going to be using 4/8
> bytes of memory each, although since the tag size of 'mostly' known
> (very few have variable sizes), that could simplify memory allocation
> somewhat.
>
>
> The overall structure as-is (although it can be pretty easily changed,
> the back-end is pretty well abstracted out from the front) is an
> octree, the octree can currently handle a max size of 32bits x 32bits
> x 32bits, although reduced a little based on the above SIZE constant
> (if size is 8 then the 32 bits is reduced to 29 bits or so).
> Basically, I need to index into a *massive*, not always loaded array
> that is 32bits x 32bits x 32bits in size *just* for indexing, not
> including data, and the data per cell may be variable size, but in
> general nearby chunks tend to be the same. So it is abstracted out as
> an octree, if all blocks in a level of the octree are all the same
> then the lower levels are deleted from memory and it is given as just
> a single block representing all blocks in the upper level, so on and
> so forth.
>
> The way the data is generated comes from a generator function. Upon
> first access of a data block, its 'chunk' is loaded, and the chunk is
> loaded by giving the location and dimension of the chunk to a
> generator function that fills it up with data (dynamically generated,
> scripted, who knows, the function is a black-box). When the data
> chunk is no longer needed for a time period then it is unloaded from
> memory by being serialized out to disk (whether by just writing a file
> with a special filename based on location and block size, or being
> written to a DB, or, how it is done now, being written to a
> distributed filesystem using Sector/Sphere, all subject to change
> based on if a better method is found). Some chunks may almost always
> be loaded into memory due to rapid access on them, potentially growing
> large to to a large number of chunks being loaded. Currently I am
> writing it into Sector/Sphere so I can distribute it, but thus far I
> have only distributed the file reading/writing of chunks and not of
> processing.
>
> I am guessing that I could distribute the processing and let each
> Sector/Sphere slave work on entire chunks at a time and just do a
> giant switch on the tag and work on each block inside the chunks, a
> couple problems with that though, the primary two being that blocks
> can affect other nearby blocks (including nearby ones in nearby
> chunks), could probably work around this by either overlapping chunks
> and having 'out of range' chunks be read-only, but still affect the
> nearby internal ones to the chunk, that would duplicate processing
> though, and I would have to make sure that data access was only
> affected by chunks of a very limited range while also ensuring that
> they only propagate outwards and not inwards, or have all affects
> propogate outwards in another pass to handle such things, thus slowing
> it down by causing another access pass, although that could be
> mitigated by just running it concurrently with the next 'tick' pass
> inside it, but would could potentially double memory usage (which I
> guess is easier to access then faster CPU usage...).
>
> So, I want to minimize latency in the tick processing, including
> minimizing reading from the drive, should I try to figure out a way to
> maximize concurrent processing on this 6-core server, or perhaps
> should I go ahead and scale out to Sector/Sphere to let slave
> computers do a lot of the processing.
>
> Hmm, an idea, I could do all of the processing of the nearby nodes to
> the players on the 6-core server itself, and do the further nodes on
> the slaves, I do not need it to be deterministic (although it would be
> *VERY* nice if it were)...
>
> I could just use Sector/Sphere for storage and create multiple client
> server to work on the data in tandom, say in specific in-game
> geographical areas, and have clients connect to the server that
> handles the area closest to them, perhaps be connected the the
> multiple nearest servers at the same time for quick access...
>
> This might be the best bet as it will not be handling just 'one' world
> either, hmm...
>
> I still need to figure out how to optimize the primary server work
> though, is the best method just to work on a block by block basis
> while switching based on the tag, or should that try to be optimized
> somehow to minimize branches in the functions? Perhaps just a master
> array list and go over those? But I do not want to update every block
> in every chunk in every tick. For example, most things will only need
> to update near every tick in only the chunks that are near players,
> further away ones can update slower if necessary. Optimally
> everything would update every tick, but that will no doubt be
> infeasible when it gets to a larger dataset. Thoughts? It seems like
> a problem that should be very simple to use such a data-oriented
> pattern, except that there are going to be a rather large amount of
> tag types (a short should handle it well, but it is not altogether
> impossible for me to need to increase it to a uint32 later), each of
> which may operate differently.
>
> Hmm, another idea, most blocks will require no internal state if I
> split their small amount of stated into new block types, but that will
> quickly balloon the block type count, and there will still be a
> comparatively small amount of blocks that do require state, very few
> with rather large state.
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>

OvermindDL1

2010-10-17 23:45:30 UTC

Permalink

On Sun, Oct 17, 2010 at 9:36 AM, Jon Watte <***@gmail.com> wrote:
> tl;dr:
> Either, you walk the data in memory order, and switch on type, or you group
> the data in type order, and do all types separately.
> In the first case, you get branch mis-predictions, which can be expensive,
> and you also get higher I-cache pressure and even D-cache pressure because
> you have to keep a bunch of different "subsystems" (type handlers?) in
> memory.
> In the second case, you have to map xyz->object (and object->xyz) in some
> way other than the three-dimensional array, such as a sparse array
> implemented as a hash table. That will cause more cache misses when locating
> the item for any particular element, but it will let you walk elements of
> the same type using linear memory access, and have less I and D cache
> pressure and better branch prediction.
> Which one is faster depends on whether you do mostly "update all of type X"
> or do mostly "find element by xyz" operations per frame. You'll simply have
> to come up with realistic workloads and profile it both ways.
> Sincerely,

That is what I figured, problem is I am going to be doing a lot of
operations all on single types within a large area, and also going to
be getting everything in range to send to the client often as well as
their view area changes and on how blocks can affect other nearby
blocks... Basically create both systems I guess and benchmark, how
fun...

Jon Watte

2010-10-19 20:36:23 UTC

Permalink

Basically create both systems I guess and benchmark, how
fun.

Or create one, and use it until it becomes a blocking problem, and then try
to benchmark the other to see if it will be any better.
No project has infinite time and resources, and pretending that they do is
directly hurtful to the success of the project.

If this is the biggest risk in your entire project, then creating both
systems and measuring them may make sense. If this is just a small risk, or
a perfectionist wanting to avoid wasting 5% of the CPU, then just implement
the simplest thing that could possibly work, write tests that prove that
it's functionally correct, and don't re-visit the code until BOTH conditions
are true:
1) You are not reaching your target frame rate on your target hardware
2) A profiler tells you that this system is the lowest hanging fruit for
performance

It's OK to plan up front, and even compare the two approaches. If you know
that obviously one of the factors will dominate, planning says that's what
you should implement for. If you think it'll be a toss-up, then just pick
the quickest one to implement, and go with that.

Sincerely,

jw

--
Americans might object: there is no way we would sacrifice our living
standards for the benefit of people in the rest of the world. Nevertheless,
whether we get there willingly or not, we shall soon have lower consumption
rates, because our present rates are unsustainable.

On Sun, Oct 17, 2010 at 4:45 PM, OvermindDL1 <***@gmail.com> wrote:

> On Sun, Oct 17, 2010 at 9:36 AM, Jon Watte <***@gmail.com> wrote:
> > tl;dr:
> > Either, you walk the data in memory order, and switch on type, or you
> group
> > the data in type order, and do all types separately.
> > In the first case, you get branch mis-predictions, which can be
> expensive,
> > and you also get higher I-cache pressure and even D-cache pressure
> because
> > you have to keep a bunch of different "subsystems" (type handlers?) in
> > memory.
> > In the second case, you have to map xyz->object (and object->xyz) in some
> > way other than the three-dimensional array, such as a sparse array
> > implemented as a hash table. That will cause more cache misses when
> locating
> > the item for any particular element, but it will let you walk elements of
> > the same type using linear memory access, and have less I and D cache
> > pressure and better branch prediction.
> > Which one is faster depends on whether you do mostly "update all of type
> X"
> > or do mostly "find element by xyz" operations per frame. You'll simply
> have
> > to come up with realistic workloads and profile it both ways.
> > Sincerely,
>
> That is what I figured, problem is I am going to be doing a lot of
> operations all on single types within a large area, and also going to
> be getting everything in range to send to the client often as well as
> their view area changes and on how blocks can affect other nearby
> blocks... Basically create both systems I guess and benchmark, how
> fun...
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>

Marc OMorain

2010-10-21 08:56:55 UTC

Permalink

>
> don't re-visit the code until BOTH conditions are true:
> 1) You are not reaching your target frame rate on your target hardware
> 2) A profiler tells you that this system is the lowest hanging fruit for
> performance
>
>
This advice should be engraved in stone. Very well said, Sir.

Richard Fabian

2010-10-21 09:22:05 UTC

Permalink

I'm not sure I agree though. Low hanging fruit is easy, but sometimes it's
better to chop the tree down to get all the fruit at once.
Right now, concentrating on the input and output data and then writing just
what's necessary to transform the data seems like a silver axe.

On 21 October 2010 09:56, Marc OMorain <***@kore.net> wrote:

> don't re-visit the code until BOTH conditions are true:
>> 1) You are not reaching your target frame rate on your target hardware
>> 2) A profiler tells you that this system is the lowest hanging fruit for
>> performance
>>
>>
> This advice should be engraved in stone. Very well said, Sir.
>
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>
>

--
fabs();
Just because the world is full of people that think just like you, doesn't
mean the other ones can't be right.

Gregory Junker

2010-10-06 20:48:40 UTC

Permalink

So you are recommending that you shouldn't think about the performance
impact of your code before you write it?

I don't see how you could possibly have read that into my comments.

Greg

Gregory Junker

2010-10-06 20:51:18 UTC

Permalink

Sorry, thanks to Outlook assuming everyone wants to write HTML in their
emails, this isn't clear; edited for who said what.

> Tony said:
> So you are recommending that you shouldn't think about the performance
> impact of your code before you write it?
>

I said:

I don't see how you could possibly have read that into my comments.

Greg

Tony Albrecht

2010-10-06 22:47:09 UTC

Permalink

Fabian wrote:
> I didn't mean to suggest that you shouldn't test performance during
testing, just that it shouldn't be
> your "first line of defense". Obviously you can't fix problems before you
notice them, but a lot of these
> problems are entirely avoidable if you think things through early enough.

Then Gregory wrote:
> Which can easily lead to "analysis paralysis"...

To which I replied:
> So you are recommending that you shouldn't think about the performance
> impact of your code before you write it?

And the Gregory replied:
> I don't see how you could possibly have read that into my comments.

Your terse comment lead me to think that you didn't like the idea of
considering performance before implementation as it could lead to analysis
paralysis. If you were just making a statement about analysis paralysis
then, I agree, yes it can. But the fear of analysis paralysis (analysis
paralysis paralysis?) is more likely to have a detrimental long term effect
on your code base than the former.

-Tony

On 7 October 2010 07:21, Gregory Junker <***@dayark.com> wrote:

> Sorry, thanks to Outlook assuming everyone wants to write HTML in their
> emails, this isn't clear; edited for who said what.
>
> > Tony said:
> > So you are recommending that you shouldn't think about the performance
> > impact of your code before you write it?
> >
>
> I said:
>
> I don't see how you could possibly have read that into my comments.
>
> Greg
>
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>

Gregory Junker

2010-10-07 02:44:40 UTC

Permalink

From: sweng-gamedev-***@lists.midnightryder.com
[mailto:sweng-gamedev-***@lists.midnightryder.com] On Behalf Of Tony
Albrecht
Sent: Wednesday, October 06, 2010 3:47 PM
To: sweng-***@midnightryder.com
Subject: Re: [Sweng-Gamedev] Data oriented programming

> Your terse comment lead me to think that you didn't like the
> idea of considering performance before implementation as it
> could lead to analysis paralysis. If you were just making a
> statement about analysis paralysis then, I agree, yes it can.
> But the fear of analysis paralysis (analysis paralysis paralysis?)
> is more likely to have a detrimental long term effect on your code
> base than the former.

> -Tony

My point is that at some point you have to start writing code. Too often I
see engineers afraid to write a single line because they have not thought
through (and therefore designed for) every single possibility in advance.
Nothing we talk about on this list rises to the same level of coding for
space shuttle or Mars missions, so while considering algorithmic complexity
or not making the obvious performance mistakes during implementation are
Good Things (as has been implied previously), a product needs to ship, and
IMO it really shouldn't go much beyond that before you start writing code.

The fact is that pretty much everyone here has already done before, whatever
it is they are embarking upon, so experience goes a long way towards
encouraging things that improve performance, even without a ton of thought
up front.

I suppose that wasn't exactly conveyed by a one-liner...

Greg

Gabriel Sassone

2010-10-07 09:54:50 UTC

Permalink

Personally I think that it's difficult to find a balance between writing
code, knowing performances from both a large and small scale, think what you
have NOT to do...it's a matter of experience and open mind/study.
I don't like the "coding" paralysis that many software engineering have (me
included until some time ago), because having "fear" of writing code because
it's not too much clean/with good performance can lead to writing nothing
(that is far worst).
Also I don't like the pattern mentality, because many times it's more a
matter of context and circumstances/experience in the code you write.
So for me, I started giving me a "decision time", after which I'll start
writing code. Then I'm writing down what can be improved, both from the
performances that from the interface design, and features: in that way I'm
finding a balance between cowboy coding and code paralysis.

Just my 50c, even if I'm not experienced as you, I wanted only to hear your
experiences with Data Oriented Programming.

By the way, thanks anybody for the replies!

Gabriel Sassone

OvermindDL1

2010-10-08 01:11:42 UTC

Permalink

On Thu, Oct 7, 2010 at 3:54 AM, Gabriel Sassone <***@gmail.com> wrote:
> /* snip */

Slightly back on to topic, data oriented programming, I know what most
of it means anyway and I do use it in some specific areas, but I am
curious, a common use-case for a game:

Class hierarchy:
GameObject
Actor
Pawn
PlayerPawn
Vehicle
Wheeled
Tracked
Flying
Controller
PlayerController
AIController
MonsterController

I am not certain how to properly get functionality over to a
data-oriented model. Just for the sake of ease-of-use argument, the
best way might be a component system? But it seems that no matter how
I think of data usage, sometimes I will need, for example, the
positions with the force descriptions, or I might need positions with
the renderer, or I might need the force descriptions with the input to
each model (and how do you model an input system based on events, such
as you have a list of events and based on what they are they need to
modify the structures of one thing per event, with possible multiple
events per thing, or maybe zero).

I have done quite a bit of looking on Google for any articles or
tutorials that anyone may have done to convert a traditionally
C++/Java-style OO hierarchy into a data-oriented or so setup for
maximal speed, does anyone know of one?

Phill Djonov

2010-10-08 08:37:31 UTC

Permalink

Go with components and limit cross-component communication to clearly
defined phases:

TickAnims();
MoveAnimDataToPhysics();
TickPhysics();
MovePhysicsDataToAnimation();
TouchUpAnimsForThingsPhysicsConstrained();

Etc.

Obviously, a lot of the move phases go away if you can find a
cache-friendly format that works well for most members of a group of
related subsystems, though often the cost of copying data around is
well worth the savings you can achieve when a complex computation is
tuned *just* right.

YMMV.

Phill

On Thu, Oct 7, 2010 at 7:11 PM, OvermindDL1 <***@gmail.com> wrote:
> On Thu, Oct 7, 2010 at 3:54 AM, Gabriel Sassone <***@gmail.com> wrote:
>> /* snip */
>
> Slightly back on to topic, data oriented programming, I know what most
> of it means anyway and I do use it in some specific areas, but I am
> curious, a common use-case for a game:
>
> Class hierarchy:
> GameObject
> Actor
> Pawn
> PlayerPawn
> Vehicle
> Wheeled
> Tracked
> Flying
> Controller
> PlayerController
> AIController
> MonsterController
>
> I am not certain how to properly get functionality over to a
> data-oriented model. Just for the sake of ease-of-use argument, the
> best way might be a component system? But it seems that no matter how
> I think of data usage, sometimes I will need, for example, the
> positions with the force descriptions, or I might need positions with
> the renderer, or I might need the force descriptions with the input to
> each model (and how do you model an input system based on events, such
> as you have a list of events and based on what they are they need to
> modify the structures of one thing per event, with possible multiple
> events per thing, or maybe zero).
>
> I have done quite a bit of looking on Google for any articles or
> tutorials that anyone may have done to convert a traditionally
> C++/Java-style OO hierarchy into a data-oriented or so setup for
> maximal speed, does anyone know of one?
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>

Warrick Buchanan

2010-10-08 08:49:34 UTC

Permalink

Also I would add to that and say don't be afraid of copying data around - especially when it comes to multi-threaded systems it can be much more preferable than the alternatives. Obviously there are limits to this though as most things and you should take care about the rate your systems produce and consume data.

> Date: Fri, 8 Oct 2010 02:37:31 -0600
> From: ***@beamdog.com
> To: sweng-***@midnightryder.com
> Subject: Re: [Sweng-Gamedev] Data oriented programming
>
> Go with components and limit cross-component communication to clearly
> defined phases:
>
> TickAnims();
> MoveAnimDataToPhysics();
> TickPhysics();
> MovePhysicsDataToAnimation();
> TouchUpAnimsForThingsPhysicsConstrained();
>
> Etc.
>
> Obviously, a lot of the move phases go away if you can find a
> cache-friendly format that works well for most members of a group of
> related subsystems, though often the cost of copying data around is
> well worth the savings you can achieve when a complex computation is
> tuned *just* right.
>
> YMMV.
>
> Phill
>
> On Thu, Oct 7, 2010 at 7:11 PM, OvermindDL1 <***@gmail.com> wrote:
> > On Thu, Oct 7, 2010 at 3:54 AM, Gabriel Sassone <***@gmail.com> wrote:
> >> /* snip */
> >
> > Slightly back on to topic, data oriented programming, I know what most
> > of it means anyway and I do use it in some specific areas, but I am
> > curious, a common use-case for a game:
> >
> > Class hierarchy:
> > GameObject
> > Actor
> > Pawn
> > PlayerPawn
> > Vehicle
> > Wheeled
> > Tracked
> > Flying
> > Controller
> > PlayerController
> > AIController
> > MonsterController
> >
> > I am not certain how to properly get functionality over to a
> > data-oriented model. Just for the sake of ease-of-use argument, the
> > best way might be a component system? But it seems that no matter how
> > I think of data usage, sometimes I will need, for example, the
> > positions with the force descriptions, or I might need positions with
> > the renderer, or I might need the force descriptions with the input to
> > each model (and how do you model an input system based on events, such
> > as you have a list of events and based on what they are they need to
> > modify the structures of one thing per event, with possible multiple
> > events per thing, or maybe zero).
> >
> > I have done quite a bit of looking on Google for any articles or
> > tutorials that anyone may have done to convert a traditionally
> > C++/Java-style OO hierarchy into a data-oriented or so setup for
> > maximal speed, does anyone know of one?
> > _______________________________________________
> > Sweng-Gamedev mailing list
> > Sweng-***@lists.midnightryder.com
> > http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
> >
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com

Richard

2010-10-08 13:52:44 UTC

Permalink

In article <BAY146-***@phx.gbl>,
Warrick Buchanan <***@hotmail.com> writes:

> Also I would add to that and say don't be afraid of copying data around
> [...]

The reasoning behind this being that the cache misses during copying is
paid for by the repeated cache hits during subsequent processing?

Does this assume that the data needs to be accessed more than once
during processing?
--
"The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
<http://legalizeadulthood.wordpress.com/the-direct3d-graphics-pipeline/>

Legalize Adulthood! <http://legalizeadulthood.wordpress.com>

Richard Fabian

2010-10-08 14:03:10 UTC

Permalink

I'd say that it's because copying has a latency of only one cache miss no
matter how much you're copying, as you can start processing the next read
straight away as it's not dependent on the arrival of previous data. The
"processing" is just doing more of the same load and store, load and store.
So you're running at memory bandwidth the whole time rather than waiting on
data before doing something.

On 8 October 2010 14:52, Richard <***@xmission.com> wrote:

>
> In article <BAY146-***@phx.gbl>,
> Warrick Buchanan <***@hotmail.com> writes:
>
> > Also I would add to that and say don't be afraid of copying data around
> > [...]
>
> The reasoning behind this being that the cache misses during copying is
> paid for by the repeated cache hits during subsequent processing?
>
> Does this assume that the data needs to be accessed more than once
> during processing?
> --
> "The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
> <http://legalizeadulthood.wordpress.com/the-direct3d-graphics-pipeline/>
>
> Legalize Adulthood! <http://legalizeadulthood.wordpress.com>
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>

--
fabs();
Just because the world is full of people that think just like you, doesn't
mean the other ones can't be right.

Fabian Giesen

2010-10-08 19:00:07 UTC

Permalink

On 10/8/2010 6:52 AM, Richard wrote:
>
> In article<BAY146-***@phx.gbl>,
> Warrick Buchanan<***@hotmail.com> writes:
>
>> Also I would add to that and say don't be afraid of copying data around
>> [...]
>
> The reasoning behind this being that the cache misses during copying is
> paid for by the repeated cache hits during subsequent processing?
>
> Does this assume that the data needs to be accessed more than once
> during processing?

Copying by itself is not fundamentally more (or less) efficient than any
other pass that only does sequential reads and writes. If you actually
have do to it manually on the CPU, then all other things being equal,
you're better off just writing to your destination(s) directly during
the preceding pass. There's some caveats - e.g. when accessing
write-combined memory directly, you need to look closely at your
generated code and make sure that it really does write sequentially with
no holes (exact limitations vary by platform; on some processors you can
write "slightly out of order" within cache lines with no penalty, but
others really need writes to be at least word-sized and sequential, so
that's what you should target). Getting a C/C++ compiler to actually do
this properly often involves liberal use of compiler write barriers
and/or "volatile" qualifiers. If the sequential writes cause other
complications (or if you prefer simplicity), it can be better to use a
small block of cached memory as "staging area" and copy from there to
write-combined mem.

Of course, if you have a DMA engine that can quickly (and
asynchronously) move data around memory, the copies really are "free"
(unless you're hitting memory bandwidth limits), which affects the usage
patterns. On the PS3, you have this and the SPUs which can't access main
memory directly, so expect to be copying around data a lot. (On SPUs,
the usual pattern is to DMA output data from block N-1, work on block N,
and DMA input data for block N+1 at the same time).

-Fabian

Phill Djonov

2010-10-08 20:55:31 UTC

Permalink

Well, even just transforming:

foreach( object )
{
//not real function calls, just illustrating the flow of the loop
//if it helps, imagine them all messy and interleaved :)
temp;
FollowPointersAround( object, temp );
BranchyFixingUpOfData( temp );
HeavyNumberCrunching( temp );
}

into:

temp[]
for( ... )
FollowPointersAround( objects[index], temp[index] ); //effectively
what we're referring to as a "copy"
for( ... )
BranchyFixingUpOfData( temp[index] );
for( ... )
HeavyNumberCrunching( temp[index] );

is a win.

It's *much* easier to go into each of those separate loops and measure
and optimize than it is the first. Loops are smaller and highly
focused. Logic mispredictions don't stall the math pipeline. Fewer
temporaries are in play, freeing registers, making unrolling
opportunities more obvious. It makes sense to do expensive
pipeline/cache-flushing CPU-mode switches between stages, or to
utilize write-combining, scratch pads, or other special hardware
features. That sort of stuff can *easily* make up for the cost of
pushing the original temp out of registers or the stack, and often
would even if everything Richard and Fabian said about mitigating copy
costs were false.

And, when the profiler screams, or you move to a new platform, you can
likely fix things in isolation without having to butcher "object" and
every other bit of code that touches it.

Phill

On Fri, Oct 8, 2010 at 7:52 AM, Richard <***@xmission.com> wrote:
>
> In article <BAY146-***@phx.gbl>,
> Warrick Buchanan <***@hotmail.com> writes:
>
>> Also I would add to that and say don't be afraid of copying data around
>> [...]
>
> The reasoning behind this being that the cache misses during copying is
> paid for by the repeated cache hits during subsequent processing?
>
> Does this assume that the data needs to be accessed more than once
> during processing?
> --
> "The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
> <http://legalizeadulthood.wordpress.com/the-direct3d-graphics-pipeline/>
>
> Legalize Adulthood! <http://legalizeadulthood.wordpress.com>
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>

Phill Djonov

2010-10-08 08:56:01 UTC

Permalink

You also mentioned events, and I didn't really address that.

I avoid those like the plague. In my experience, having subsystems
directly call each other as they work is a recipe for disaster, as the
code you get is highly interdependent and you lose the ability to
reorganize the internals of a system (for performance or otherwise)
because the entire rest of the game is expecting things of it at all
times. Message queues or buffers of intermediate results, on the other
hand, are awesome:

listOfCollisions = BeginPhysicsTick();
DecideOnCollisionResolutions( listOfCollisions ); //updates the list inline
FinishPhysicsTick( listOfCollisions );

gameCollisions = FindCollisionsGameplayCaresAbout( listOfCollisions );
SpawnCollisionEffects( gameCollisions );
ApplyImpactDamage( gameCollisions );

//use listOfCollisions later to optimize what data you send over the network as
//prediction might do a decent job of objects that are simply moving
along uninterrupted

//etc

Phill

On Thu, Oct 7, 2010 at 7:11 PM, OvermindDL1 <***@gmail.com> wrote:
> On Thu, Oct 7, 2010 at 3:54 AM, Gabriel Sassone <***@gmail.com> wrote:
>> /* snip */
>
> Slightly back on to topic, data oriented programming, I know what most
> of it means anyway and I do use it in some specific areas, but I am
> curious, a common use-case for a game:
>
> Class hierarchy:
> GameObject
> Actor
> Pawn
> PlayerPawn
> Vehicle
> Wheeled
> Tracked
> Flying
> Controller
> PlayerController
> AIController
> MonsterController
>
> I am not certain how to properly get functionality over to a
> data-oriented model. Just for the sake of ease-of-use argument, the
> best way might be a component system? But it seems that no matter how
> I think of data usage, sometimes I will need, for example, the
> positions with the force descriptions, or I might need positions with
> the renderer, or I might need the force descriptions with the input to
> each model (and how do you model an input system based on events, such
> as you have a list of events and based on what they are they need to
> modify the structures of one thing per event, with possible multiple
> events per thing, or maybe zero).
>
> I have done quite a bit of looking on Google for any articles or
> tutorials that anyone may have done to convert a traditionally
> C++/Java-style OO hierarchy into a data-oriented or so setup for
> maximal speed, does anyone know of one?
> _______________________________________________
> Sweng-Gamedev mailing list
> Sweng-***@lists.midnightryder.com
> http://lists.midnightryder.com/listinfo.cgi/sweng-gamedev-midnightryder.com
>