{"id":125,"date":"2024-04-09T18:01:00","date_gmt":"2024-04-09T18:01:00","guid":{"rendered":"https:\/\/permutationcity.co.uk\/bp\/?p=125"},"modified":"2024-07-27T08:52:51","modified_gmt":"2024-07-27T08:52:51","slug":"data-oriented-design","status":"publish","type":"post","link":"https:\/\/permutationcity.co.uk\/bp\/2024\/04\/09\/data-oriented-design\/","title":{"rendered":"Data-oriented design"},"content":{"rendered":"\n<p>I read <a href=\"https:\/\/www.dataorienteddesign.com\/dodbook.pdf\">Data-Oriented Design<\/a> by Richard Fabian\nbut will mostly cover the methodology (DoD) not the book.\nLet&#8217;s get it out of the way and I&#8217;ll try not to belabour it:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>I didn&#8217;t enjoy this book and wouldn&#8217;t recommend it.<\/li>\n\n\n\n<li>The author seems to have a chip on his shoulder about object-orientation and he spends too long saying it&#8217;s bad. He should have spent more time on DoD instead, that&#8217;s what I wanted to learn.<\/li>\n\n\n\n<li>I had to read some sections several times trying to understand his use of negatives and double negatives. It could have been written in a more straight forward way.<\/li>\n\n\n\n<li>The sample code are both not to my taste and inconsistent, mixing different styles even within one example.<\/li>\n\n\n\n<li>Surely there must be something better than this.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"its-about-all-the-data\">It&#8217;s about all the data<\/h2>\n\n\n\n<p>Data-oriented design concentrates on the underlying data in any given system.\nData tends to be stored in a very simple way as integers, floats and strings within arrays.\nIt does allow for common structures for the domain such as a 3D vector, meshes or images.\nThe book spends a chapter on relational databases and normalising your data into standard forms.\nYou probably won&#8217;t be using a database but the data is laid out in a similar way.\nThere are a couple of reasons for doing this.<\/p>\n\n\n\n<p>I can get behind the first reason.\nBy paying attention to how your data is laid out in memory you cluster data that is used together in an operation.\nThis means that when the CPU fetches new data into the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Cache_(computing)\">cache<\/a>\nthat most or all of that data will be relevant.\nEvery single byte in the cache line will needed by the operation and is used by the operation.\nNo extra time will be spent waiting for memory reads until another entire cache line is needed.\nWithout this attention data is often laid out so that all the data for one owner is together.\nOperations tend to need some but not all of the data for one owner.\nMost of each cache line is taken up by data irrelevant to this particular operation.\nThe process will often have to wait for new cache lines to be read.\nBy putting extra effort into data layout you get higher performance.<\/p>\n\n\n\n<p>I wasn&#8217;t convinced by the second reason.\nHe feels that this alternative layout is better for extensibility, debugging and probably other things as well.\nIt is true that new features would tend to introduce new arrays rather than modify existing ones.\nHe acknowledge any difficulty from keeping track of all these separate arrays or synchronising data between them.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"example-particle-system\">Example particle system<\/h2>\n\n\n\n<p>I made a few very simple particle systems to look at some performance numbers.\nThe test creates a number of particles and randomly assigns a velocity to 10% of them.\nThe particles are them simulated for a number of frames.\nThe faster the simulation can be finished the better.\nThere are two object-oriented designs and two data-oriented designs:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Object-oriented design:\n<ul class=\"wp-block-list\">\n<li>Simulate moving particles &#8211; Moving particles are updated every frame, static particles are skipped.<\/li>\n\n\n\n<li>Simulate every particle &#8211; Every particle is updated every frame, static particles not skipped they just have zero velocity.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><p>Data-oriented design:<\/p>\n<ul class=\"wp-block-list\">\n<li><span style=\"background-color: var(--wp--preset--color--base); color: var(--wp--preset--color--contrast); font-family: var(--wp--preset--font-family--system-font); font-size: var(--wp--preset--font-size--medium);\">Indexed velocity &#8211; <\/span><span style=\"color: var(--wp--preset--color--contrast); font-family: var(--wp--preset--font-family--system-font); font-size: revert; background-color: var(--wp--preset--color--base);\">Every particle has a position but only moving particles have a velocity.<\/span><span style=\"background-color: var(--wp--preset--color--base); color: var(--wp--preset--color--contrast); font-family: var(--wp--preset--font-family--system-font); font-size: var(--wp--preset--font-size--medium);\"> <\/span><span style=\"color: var(--wp--preset--color--contrast); font-family: var(--wp--preset--font-family--system-font); font-size: revert; background-color: var(--wp--preset--color--base);\">Moving particles are updated every frame.<\/span><\/li>\n\n\n\n<li>Velocity &#8211; <span style=\"color: var(--wp--preset--color--contrast); font-family: var(--wp--preset--font-family--system-font); font-size: revert; background-color: var(--wp--preset--color--base);\">Every particle has a position but only moving particles have a velocity.<\/span><span style=\"color: var(--wp--preset--color--contrast); font-family: var(--wp--preset--font-family--system-font); font-size: var(--wp--preset--font-size--medium); background-color: var(--wp--preset--color--base);\"> Moving particles are updated every frame.<\/span><span style=\"color: var(--wp--preset--color--contrast); font-family: var(--wp--preset--font-family--system-font); font-size: revert; background-color: var(--wp--preset--color--base);\"> Static particles are kept in their own array out of the way.<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;clike&quot;,&quot;mime&quot;:&quot;text\/x-c++src&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;C++&quot;,&quot;language&quot;:&quot;C++&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;cpp&quot;}\">\/\/ Object-oriented design\nstruct Particle {\n    Vector3 position;\n    Vector3 velocity;\n    \/\/ Extra data to represent particle name, colour, lifetime, etc.\n    ...\n};\nstd::vector&lt;Particle&gt; particles;<\/pre><\/div>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;clike&quot;,&quot;mime&quot;:&quot;text\/x-c++src&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;C++&quot;,&quot;language&quot;:&quot;C++&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;cpp&quot;}\">\/\/ Data-oriented design, indexed velocity\nstruct IndexedVelocity {\n    size_t index;\n    Vector3 value;\n};\nstd::vector&lt;Vector3&gt; positions;\nstd::vector&lt;IndexedVelocity&gt; velocities;<\/pre><\/div>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;clike&quot;,&quot;mime&quot;:&quot;text\/x-c++src&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;C++&quot;,&quot;language&quot;:&quot;C++&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;cpp&quot;}\">\/\/ Data-oriented design, velocity\nstd::vector&lt;Vector3&gt; movingPositions;\nstd::vector&lt;Vector3&gt; velocities;\nstd::vector&lt;Vector3&gt; staticPositions;<\/pre><\/div>\n\n\n\n<p>You can <a href=\"https:\/\/quick-bench.com\/q\/AHPhluktJ7dQiKxoZKnH5ttZIfY\">take a look<\/a> at the full source and the performance but here is the graph:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"511\" src=\"https:\/\/permutationcity.co.uk\/bp\/wp-content\/uploads\/2024\/04\/data-oriented_10percent-1024x511.png\" alt=\"\" class=\"wp-image-123\" srcset=\"https:\/\/permutationcity.co.uk\/bp\/wp-content\/uploads\/2024\/04\/data-oriented_10percent-1024x511.png 1024w, https:\/\/permutationcity.co.uk\/bp\/wp-content\/uploads\/2024\/04\/data-oriented_10percent-300x150.png 300w, https:\/\/permutationcity.co.uk\/bp\/wp-content\/uploads\/2024\/04\/data-oriented_10percent-768x383.png 768w, https:\/\/permutationcity.co.uk\/bp\/wp-content\/uploads\/2024\/04\/data-oriented_10percent.png 1450w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Both object-oriented designs show similar performance,\nthere isn&#8217;t really a performance gain from trying to skip the vector addition.\nThe first set of bars <code>UpdateMovingParticle&lt;4&gt;<\/code> to <code>UpdateMovingParticle&lt;64&gt;<\/code> show that the more\njunk data in the struct the slower the simulation.\nThe second set of bars <code>UpdateEveryParticle&lt;4&gt;<\/code> to <code>UpdateEveryParticle&lt;64&gt;<\/code> shows the same.<\/p>\n\n\n\n<p>Both data-oriented designs show much better performance.\nThe <code>UpdateIndexedVelocities<\/code> is 6 times faster than the fastest <code>UpdateMovingParticle<\/code>.\nThe <code>UpdateVelocities<\/code> is 14 times faster than the fastest <code>UpdateMovingParticle<\/code>.\nHaving all the data in one place really does make a difference.<\/p>\n\n\n\n<p>However details matter, if 50% of the particles are moving we get this:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"511\" src=\"https:\/\/permutationcity.co.uk\/bp\/wp-content\/uploads\/2024\/04\/data-oriented_50percent-1024x511.png\" alt=\"\" class=\"wp-image-124\" srcset=\"https:\/\/permutationcity.co.uk\/bp\/wp-content\/uploads\/2024\/04\/data-oriented_50percent-1024x511.png 1024w, https:\/\/permutationcity.co.uk\/bp\/wp-content\/uploads\/2024\/04\/data-oriented_50percent-300x150.png 300w, https:\/\/permutationcity.co.uk\/bp\/wp-content\/uploads\/2024\/04\/data-oriented_50percent-768x383.png 768w, https:\/\/permutationcity.co.uk\/bp\/wp-content\/uploads\/2024\/04\/data-oriented_50percent.png 1450w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>The <code>UpdateMovingParticle<\/code> approach is almost uniformly slow.\nIt tries to skip unnecessary additions but that seems to be messing with\n<a href=\"https:\/\/en.wikipedia.org\/wiki\/Branch_predictor\">branch prediction<\/a>.\nThe other approaches are slower because more work has to be done but have the same speed relative to one another.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"how-to-layout\">How to layout<\/h2>\n\n\n\n<p>In object-oriented design a collection of data tends to be an array of structs (AoS).\nAs we&#8217;ve seen this tends to leave gaps between the data for a given operation.\nIn data-oriented design a collection of data tends to be a struct of arrays (SoA).\nThis means that relevant data is all tightly packed together.<\/p>\n\n\n\n<p>It&#8217;s not just about switching from AoS to SoA.\nIn <a href=\"https:\/\/www.dataorienteddesign.com\/dodbook\/node4.html\">existential processing<\/a>\nyou avoid performing checks because you already know the answer for this data.\nYou can avoid storing and checking boolean or enum states by storing items in an array specific to that state.<\/p>\n\n\n\n<p>The book uses an example of enemies in a video game with regenerating health.\nIt&#8217;s tempting to have an array of floats to represent the health for each enemy.\nRunning through all enemies to see if they needed to regenerate would be wasteful.\nA first step at optimisation would involve using separate arrays for injured and healthy enemies.\nRunning through the injured enemies reduces the amount of processing.\nA second step at optimisation might discard the array for healthy enemies entirely.\nIf you know they are healthy then presumably they have the default health value for that enemy type.\nThis reduces the overall memory overhead and setup processing.<\/p>\n\n\n\n<p>Later on the book covers hierarchical level of detail and a similar tick that can be used.\nLets say a wave of enemies is made up of squads and a squad is made up of soldiers.\nIf a wave is far away from the player it is not necessary to simulate soldiers or even squads.\nInstead the wave can be abstracted, perhaps as a central position and radius.\nAs the wave and player get closer first squads and then individual soldiers can be spawned.\nAs the wave and player get further away soldiers and squads are de-spawned.\nIf the player fights an injures a solider then it would be nice to show this even if the solider is re-spawned.\nThe essentials of the solider can be stored in a <em>memento<\/em> that is attached to it&#8217;s squad.\nWhen the solider is re-spawned the memento provides the information to turn it from\na generic solider to a more specific one.<\/p>\n\n\n\n<p>So pack the data you are going to use at one time as tightly as possible. If you can get away with have little or no data by using context then do so.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"a-halfway-house\">A halfway house<\/h2>\n\n\n\n<p>In-between object-oriented design and data-oriented design is component-oriented design.\nThis uses smaller structs or components and combines them together to make the functionality for each entity.\nThis can be organised in two ways:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Each entity is explicit, an object and contains a collection of components.<br>This is how the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Unity_(game_engine)\">Unity game engine<\/a> works.<\/li>\n\n\n\n<li>Each entity is implicit, there is nothing but components some of which relate to the same entity.<\/li>\n<\/ul>\n\n\n\n<p>Managers can then be used to update all the components related to one operation, say,\nrendering or physics at one go.\nYou get some of the performance benefit of DoD but with fewer structural changes.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"optimisation\">Optimisation<\/h2>\n\n\n\n<p>The book has a few good ideas here, often with a game focus.<\/p>\n\n\n\n<p>Take a planned approach to optimisation.\nGive each system a time budget and make sure they stick within that budget.\nRendering, physics and similar systems can be given a certain number of milliseconds per frame.\nIf you add up all the budgets you know the milliseconds per frame and therefore the frame-rate.<\/p>\n\n\n\n<p>Removing an item from an vector is normally an <a href=\"https:\/\/en.wikipedia.org\/wiki\/Big_O_notation\">O(n)<\/a> operation.\nHowever that&#8217;s only if you need the vector to stay in the same order.\nIf you don&#8217;t care about the order then it&#8217;s O(1). Something like:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:&quot;language&quot;,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;clike&quot;,&quot;mime&quot;:&quot;text\/x-c++src&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;C++&quot;,&quot;language&quot;:&quot;C++&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;cpp&quot;}\">template &lt;class Iterator&gt;\nvoid unstable_erase(std::vector&lt;Type&gt;&amp; values, Iterator it) {\n    auto last = std::prev(values.end());\n    if (it != last) {\n        std::swap(*it, *last);\n    }\n    values.erase(last);\n}<\/pre><\/div>\n\n\n\n<h1 class=\"wp-block-heading\" id=\"object-oriented-vs-data-oriented-design\">Object-oriented vs data-oriented design<\/h1>\n\n\n\n<p>The author of the book had the general option that data-oriented design was better.\nIn the introduction he allows that an object-oriented project can be setup faster and\nthat there is a more obvious mapping between the real world problem and the code.\nPresumably he judges that extra setup time and a harder job translating problem into code is worth it.\nHe says a lot of other negative things about object-oriented design but it&#8217;s so extreme I have to question it.<\/p>\n\n\n\n<p>I have been persuaded that in performance critical systems a data-oriented design has a big advantage.\nHowever to get that performance benefit I think my time and thought has to be spent developing the code.\nI worry about other consequences.\nThis database like approach is very open.\nAll the data is exposed to everyone and anyone can change it.\nIf they follow all the rules for manipulating the data then it&#8217;s fine.\nHowever as <a href=\"https:\/\/permutationcity.co.uk\/bp\/2024\/01\/31\/we-all-make-mistakes\/\">we all make mistakes<\/a>\nI don&#8217;t generally trust everyone, even myself, to follow the rules.\nThe limited interfaces and private variables of object-oriented design protect the developer from themselves\nand everyone else.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">On balance<\/h2>\n\n\n\n<p>I think it&#8217;s a situation of advantages and disadvantages on both sides. If final performance is most important and development effort is less important then use data-oriented design. If final performance is less important and development effort is more important than use object-oriented design. Maybe use some of one and some of the other depending on the system.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I read Data-Oriented Design by Richard Fabian but will mostly cover the methodology (DoD) not the book. Let&#8217;s get it out of the way and I&#8217;ll try not to belabour it: It&#8217;s about all the data Data-oriented design concentrates on the underlying data in any given system. Data tends to be stored in a very [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_import_markdown_pro_load_document_selector":0,"_import_markdown_pro_submit_text_textarea":"","footnotes":""},"categories":[34],"tags":[20,27,7],"class_list":["post-125","post","type-post","status-publish","format-standard","hentry","category-review","tag-books","tag-paradigm","tag-performance"],"_links":{"self":[{"href":"https:\/\/permutationcity.co.uk\/bp\/wp-json\/wp\/v2\/posts\/125","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/permutationcity.co.uk\/bp\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/permutationcity.co.uk\/bp\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/permutationcity.co.uk\/bp\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/permutationcity.co.uk\/bp\/wp-json\/wp\/v2\/comments?post=125"}],"version-history":[{"count":6,"href":"https:\/\/permutationcity.co.uk\/bp\/wp-json\/wp\/v2\/posts\/125\/revisions"}],"predecessor-version":[{"id":203,"href":"https:\/\/permutationcity.co.uk\/bp\/wp-json\/wp\/v2\/posts\/125\/revisions\/203"}],"wp:attachment":[{"href":"https:\/\/permutationcity.co.uk\/bp\/wp-json\/wp\/v2\/media?parent=125"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/permutationcity.co.uk\/bp\/wp-json\/wp\/v2\/categories?post=125"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/permutationcity.co.uk\/bp\/wp-json\/wp\/v2\/tags?post=125"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}