System Designer: Ideas for a Simulation Framework

Francis Tseng

hello, name
thank you Bernardo for the invitation
software developer who works with simulation and machine learning
co-publisher of TNI
been working with Bernardo on SEAL
in the past, interaction designer at IDEO, fellow at the New York Times/Washington Post

caveat: I'm very much arriving at the practice of simulation as an outsider

Differential equations

there are different simulation methodologies

System dynamics

system dynamics
stocks and flows
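
as a rough illustration of stocks and flows, here is a minimal sketch of a single stock with a constant inflow and an outflow proportional to the stock, stepped forward with Euler integration. the function name and all numbers are hypothetical, chosen only to show the idea.

```python
def simulate_stock(initial, inflow, outflow_rate, dt, steps):
    """Euler-integrate one stock: d(stock)/dt = inflow - outflow_rate * stock."""
    stock = initial
    history = [stock]
    for _ in range(steps):
        # the net flow is what changes the stock during this time step
        net_flow = inflow - outflow_rate * stock
        stock += net_flow * dt
        history.append(stock)
    return history


# hypothetical numbers: a tub filling at 10 units per time step,
# draining 10% of its contents per time step
levels = simulate_stock(initial=0.0, inflow=10.0, outflow_rate=0.1, dt=1.0, steps=200)
```

the stock settles where inflow equals outflow, here at inflow / outflow_rate = 100.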

Agent-based modeling

Motivation

motivation behind my interest in simulations

Wicked problems

difficult to agree on the problem (or whether there's a problem at all)
there is no comprehensive perspective
wicked problems are interconnected
no way to verify or experiment with solutions
there may be (very) delayed feedback loops
every wicked problem is unique
they are nonlinear (outputs are disproportionate to the inputs)
they are massive & intimidating
wicked problems, it seems, beget more wicked problems

Rittel, H. W., & Webber, M. M. (1973). Dilemmas in a general theory of planning. Policy sciences, 4(2), 155-169.

How do you make sense of nauseatingly complex systems?

this question seems central to contending with wicked problems,
providing better ways of understanding them, providing evidence for
strategies, and so on.

The Image of the City, Kevin Lynch

urban planner Kevin Lynch interviews people in multiple cities to understand how they form "mental maps" of their cities. what is their "mental image" of the city? their intuitive understanding that allows them to navigate it how they like? where does it come from, how does it develop?

Cognitive mapping

a situational representation on the part of the individual subject to that vaster and properly unrepresentable totality which is the ensemble of society's structures as a whole (Postmodernism, Fredric Jameson)

this "urban mental map" is one example of the more general way we form understandings of the world, called "cognitive mapping". Fredric Jameson defined it as such.

cognitive maps are how we make sense of the world, how we orient ourselves within it, and how we understand how to talk about and approach problems.

Representational infrastructure

(Wilensky & Papert)

no single piece of information or media completely forms these cognitive maps. they aggregate over time as we pick up different frameworks for seeing the world, have new and unexpected experiences, communicate with others, and so on.

the bits and pieces that contribute to the formation of our cognitive maps can be called "representational infrastructure", a term I first encountered through Wilensky & Papert.

for example, maps are one form of such infrastructure for navigating physical space. another might be a particular cinematic practice, or a written form.

"Crisis of representation", "Representational breakdown"

the representational infrastructure we've come to rely on for navigating and making sense
of the world is increasingly inadequate.
for example: with increased mobility, people moving from city to city,
it may be more difficult to develop a comprehensive "image of a city".
you don't spend enough time to do so, and if you're transient, you may not be
invested in even trying to do so.

Alienation

an inability to cognitively map the mechanisms and contours of the world system is as debilitating politically as being unable to mentally map a city would be for a city dweller. (Jeff Kinkle, referring to Fredric Jameson)

Jeff Kinkle, referring to Jameson, uses the term "alienation" to describe the debilitation that results from an inability to form appropriate cognitive maps. a constant sense of being lost, disoriented, powerless.

in some domains where shifting lifestyles, cultural norms, and so on disrupt our preexisting capacity for forming cognitive maps, we develop new tools to help mitigate these problems. some of these tools, like Google Maps, obviate the need for representational infrastructure altogether. You don't necessarily need to form a mental map of the city if you have an external one you can always refer to.
(though this is certainly not without its costs: for example, we may become too reliant on a service like Google Maps and lose a sense of our own city)

New representational infrastructures

e.g. Roman -> Hindu-Arabic numerals

(Wilensky & Papert, 2006;2010)

instead of tools that render a representational infrastructure unnecessary, we can create new representational infrastructures. the most compelling example of this, put forward by Wilensky & Papert, is the shift from Roman numerals, which made many kinds of arithmetic clunky and difficult, to Hindu-Arabic numerals, which were generally easier to work with.

Donella H. Meadows (Thinking in Systems):

there is a problem in discussing systems only with words. Words and sentences must, by necessity, come only one at a time in linear, logical order. Systems happen all at once.

... Pictures work for this language better than words, because you can see all the parts of a picture at once.

Donella Meadows also offers a compelling example when contrasting pictures against words. Each form of communication has certain affordances that make it better or worse for conveying certain ideas.

Bureau d'Études' An Atlas of Agendas

pictures certainly have their strengths, but when reckoning with very intricate systems, they still fall short.
This is a project by Bureau d'Études mapping out various organizations, corporations, and so on and their relations to one another.
but as you can probably see, it's pretty overwhelming.

Perhaps...simulations?

Wilensky & Papert propose simulations (in particular, agent-based models) as a critical new kind of representational infrastructure

Why simulation?

way of interactively learning about a particular system, and systems in general
a canvas for alternatives, exploring policy-space
a means for discourse, a new way of making arguments
exploration & education, experience generation
- depending on framing, presentation, and design, can appeal to many different audiences; e.g. games may lack rigor but have wider appeal

A canvas for alternatives, exploring policy-space

Instead of seeing an individual proposal as “right or wrong”, “bad or good”, people can see it as one point in a large space of possibilities. By exploring the model, they come to understand the landscape of that space, and are in a position to invent better ideas for all the proposals to come. Model-driven material can serve as a kind of enhanced imagination.

What can a technologist do about climate change?, Bret Victor

A means of discourse, a new way of making arguments

Procedural rhetoric:

an argument made by means of a computer model. A procedural rhetoric makes a claim about how something works by modeling its processes in the process-native environment of the computer rather than using description (writing) or depiction (images). (Persuasive Games: The Proceduralist Style, Ian Bogost)

Countermapping / Radical cartography

Detroit money transfers (Fitzgerald: Geography of a Revolution, William Bunge)

The Detroit Geographical Expedition and Institute (DGEI, William Bunge & Gwendolyn Warren)

Exploration & education, experience generation

"Explorable explanations":

What if a book didn't just give you old facts, but gave you the tools to discover those ideas for yourself, and invent new ideas? What if, while reading a blog post, you could insert your own knowledge, challenge the author's assumptions, and build things the author never even thought of... all inside the blog post itself? (Nicky Case)

Example: Parable of the Polygons (Nicky Case & Vi Hart)

Humans of Simulated New York (HOSNY)

here's one example simulation I worked on with my collaborator Fei Liu
we referenced some of Bernardo's work
it's a basic economic simulation which we designed to be participative, as a way of speculating future scenarios

Inspirations

Cybersyn

Dwarf Fortress

Dwarf Fortress: Boatmurdered

Data

Frequently Occurring Surnames from the Census 2000. Surnames occurring 100 or more times in the 2000 census.
Female/male first names from the Census 1990
Household and individual IPUMS data for 2005-2014, retrieved from IPUMS USA, Minnesota Population Center, University of Minnesota
PUMS network map of NY, hand-compiled from the NYC PUMA map
NYC unemployment data was retrieved from New York State Department of Labor
S&P500 data was retrieved from Open Knowledge's Standard and Poor's (S&P) 500 Index Data including Dividend, Earnings and P/E Ratio
Annual expenses calculated from Living Wage Calculator (Amy K. Glasmeier, Carey Anne Nadeau, Eric Schultheis, 2014).

design document
we had a lot of systems going on, in retrospect not super coherent, but interesting to combine

Parameterizing the world

Participatory simulation

Population generation: Bayes Net

>>> from people import generate
>>> year = 2005
>>> generate(year)
{
    'age': 36,
    'education': <Education.grade_12: 6>,
    'employed': <Employed.non_labor: 3>,
    'wage_income': 3236,
    'wage_income_bracket': '(1000, 5000]',
    'industry': 'Independent artists, performing arts, spectator sports, and related industries',
    'industry_code': 8560,
    'neighborhood': 'Greenwich Village',
    'occupation': 'Designer',
    'occupation_code': 2630,
    'puma': 3810,
    'race': <Race.white: 1>,
    'rent': 1155.6864868468731,
    'sex': <Sex.female: 2>,
    'year': 2005
}
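
to give a sense of how a Bayes net can generate people like the one above, here is a minimal sketch of ancestral sampling: each attribute is drawn conditioned on its parent. this is not the HOSNY code; the three-node structure, the table names, and every probability below are hypothetical.

```python
import random

# hypothetical conditional probability tables (not values learned from census data)
P_EDUCATION = {"high_school": 0.5, "college": 0.5}
P_EMPLOYED_GIVEN_EDUCATION = {
    "high_school": {"employed": 0.6, "unemployed": 0.4},
    "college": {"employed": 0.8, "unemployed": 0.2},
}
P_INCOME_GIVEN_EMPLOYED = {
    "employed": {"(20000, 50000]": 0.7, "(50000, 100000]": 0.3},
    "unemployed": {"(0, 1000]": 1.0},
}


def draw(distribution, rng):
    """Draw one value from a {value: probability} table."""
    values = list(distribution)
    weights = [distribution[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def sample_person(rng):
    """Sample parents before children: education -> employed -> income bracket."""
    education = draw(P_EDUCATION, rng)
    employed = draw(P_EMPLOYED_GIVEN_EDUCATION[education], rng)
    income = draw(P_INCOME_GIVEN_EMPLOYED[employed], rng)
    return {"education": education, "employed": employed, "wage_income_bracket": income}


rng = random.Random(0)
population = [sample_person(rng) for _ in range(1000)]
```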

Social networks: Confidant Model

Social Distance in the United States: Sex, Race, Religion, Age, and Education Homophily among Confidants, 1985 to 2004. Jeffrey A. Smith, Miller McPherson, Lynn Smith-Lovin. University of Nebraska - Lincoln. 2014.

New York City American Community Survey data spanning 2005-2014

Agent preferences: Utility Functions

Planning agents: A* Search

"What sequence of actions today do I expect to make me happiest?"

(too expensive)
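
a small sketch of why this kind of planning gets expensive. note that this is plain exhaustive enumeration rather than A*, and the actions, their effects, and the utility function are all hypothetical. the point is only that the number of candidate plans grows as (number of actions) ** (planning horizon).

```python
from itertools import product

# hypothetical actions and their effect on (money, energy)
ACTIONS = {
    "work": (10, -3),
    "rest": (0, 1),
    "eat": (-4, 3),
}


def utility(money, energy):
    """Hypothetical utility: an agent that ends the day exhausted is never happy."""
    if energy <= 0:
        return float("-inf")
    return money + 2 * energy


def best_plan(money, energy, horizon):
    """Score every possible action sequence and keep the best one."""
    best_score, best_sequence, evaluated = float("-inf"), None, 0
    for sequence in product(ACTIONS, repeat=horizon):
        m, e = money, energy
        for action in sequence:
            dm, de = ACTIONS[action]
            m, e = m + dm, e + de
        evaluated += 1
        score = utility(m, e)
        if score > best_score:
            best_score, best_sequence = score, sequence
    return best_sequence, best_score, evaluated


plan, score, evaluated = best_plan(money=0, energy=5, horizon=3)
```

with 3 actions and a horizon of 3 this scores 27 sequences; a horizon of 20 would already mean about 3.5 billion, per agent, per day.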

Learning agents: Q-Learning

Economy design

Agent-based model of economics: Market mechanisms, decision making, taxation. Leonid Hulianytskyi, Diana Omelianchyk. International Journal "Information Technologies & Knowledge" Volume 9, Number 1, 2015. 25-33.
An agent-based model of a minimal economy. Christopher K. Chan. May 5, 2008.
A simple agent-based spatial model of the economy: tools for policy. Bernardo Alves Furtado, Isaque Daniel Rocha Eberhardt. February 9, 2016.
Modeling Complex Systems for Public Policies. Editors: Bernardo Alves Furtado, Patrícia A. M. Sakowski, and Marina H. Tóvolli. Brasília, 2015.

Takeaways from HOSNY

1. Developing complex simulations is really difficult

Debugging is difficult, by virtue of the model's complexity
How do you validate the model?

2. Developing large simulations is really difficult

Quickly run into limitations of the browser (for web-based visualizations)
Large populations with complex/nuanced behaviors quickly run into speed and/or memory limits
Distributed methods quickly become necessary, but may introduce additional complications

3. The process of developing the simulation is just as enlightening as the simulation itself

we learned a lot during the simulation's design and development.
we originally focused on how the end result would help others explore and learn.
but then we became interested in how we can involve others in the process of developing the simulation as well.

4. Many tenuous assumptions need to be made in the simulation's design

5. Machine learning methods are quite useful

Shortcomings of simulations

Computational limitations
Accessibility in design
Necessarily built on assumptions which may not be immediately apparent
People tend to see computational models as infallible

from these main takeaways, some shortcomings became clear

Simulation assumptions

To me, the most troubling shortcoming is the number of assumptions that need to be made in the development of a simulation. This is especially troubling given the last shortcoming I mentioned: people may neglect to consider these assumptions if they themselves assume that computational models have some kind of rigor that really may not be there.

Obligatory George Box quote:

all models are wrong, but some are useful
dangers of "social physics", "the view from nowhere", "big data as reality", & co...

SimCity 5 design

To succeed even within the game’s fairly broad definition of success (building a habitable city), you must enact certain government policies. An increase in the number of police stations, for instance, always correlates to a decrease in criminal activity; the game’s code directly relates crime to land value, population density, and police stations. Adding police stations isn’t optional, it’s the law.

Les Simerables, Ava Kofman

this is from a fantastic essay by Ava Kofman examining the assumptions that the SimCity games bake into their gameplay.

"The Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it. The following Generations...saw that that vast Map was Useless."
- On Exactitude in Science (Jorge Luis Borges)

similarly, simulations necessarily leave some things out. if we completely reproduced the complexity and detail in the system we're modeling, what are we going to learn from the model? we'd have the same difficulties in parsing out causal relationships and generating more concise theories.

Tools for Simulations

many of these shortcomings can be partially addressed
by better tooling.

I started building Djinn, a lower-level library in Rust, to support some of these features,
and Fei and I started working on a higher-level library in Python to implement other sets of these features, but both remain somewhat incomplete

there are some tools, such as NetLogo and Repast, but as far as I know, they lack most of what I describe here.

Easy data ingestion and model learning

State Space Reconstruction: Takens' Theorem and Shadow Manifolds (Sugihara Lab)

of course, when dealing with lots of data from many different sources, a lot of time must be spent cleaning and formatting the data. that process is so dataset-specific that I don't think we'll be able to offer much help there.

but we could make it easier for people to parameterize subsystems in a simulation from real-world data

e.g. learning a Bayes net from census data as we did with HOSNY

don't want to make this too easy, or people will not scrutinize the results of these learned models.
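
as a sketch of what "parameterizing from data" can look like at its simplest: estimating one conditional probability table of a Bayes net by counting. this assumes the network structure (which variable depends on which) is already chosen by the modeler; `learn_cpt` and the toy records are hypothetical stand-ins, not the HOSNY pipeline.

```python
from collections import Counter, defaultdict


def learn_cpt(records, child, parent):
    """Estimate P(child | parent) by counting co-occurrences in the records."""
    counts = defaultdict(Counter)
    for row in records:
        counts[row[parent]][row[child]] += 1
    cpt = {}
    for parent_value, child_counts in counts.items():
        total = sum(child_counts.values())
        cpt[parent_value] = {value: n / total for value, n in child_counts.items()}
    return cpt


# hypothetical toy records standing in for cleaned census rows
records = [
    {"education": "college", "employed": "yes"},
    {"education": "college", "employed": "yes"},
    {"education": "college", "employed": "no"},
    {"education": "high_school", "employed": "yes"},
    {"education": "high_school", "employed": "no"},
]
cpt = learn_cpt(records, child="employed", parent="education")
```

estimates built from this few rows are fragile, which is exactly the kind of thing a user should be prompted to scrutinize.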

Transparent distributed support

Write as if the simulation runs on a single computer
- No dealing with race conditions, interprocess communication, etc

make it easy for users to write their simulations as if they were running on a single computer
but in the background, able to scale to any number of computers on a network
this is very challenging to do in a generalized way, because once you distribute a simulation's actors across a network, you introduce network latency, which can obliterate any parallel speed gains. What you want to do is come up with some heuristic for co-locating actors that will communicate frequently onto the same core. But this heuristic is ultimately application-specific.
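
one possible co-location heuristic, sketched under strong assumptions: we already have counts of messages exchanged between pairs of actors, and each worker (core) can hold at most `capacity` actors. the function greedily merges the most talkative pairs first. the names and numbers are hypothetical, and a real application would likely need something more tailored.

```python
def colocate(actors, message_counts, capacity):
    """Greedily group actors that talk the most, with at most `capacity` per group."""
    parent = {a: a for a in actors}
    size = {a: 1 for a in actors}

    def find(a):
        # follow parent links to the group representative, shortening the path as we go
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # consider the heaviest communication links first
    links = sorted(message_counts.items(), key=lambda kv: kv[1], reverse=True)
    for (a, b), _count in links:
        ra, rb = find(a), find(b)
        if ra != rb and size[ra] + size[rb] <= capacity:
            parent[rb] = ra
            size[ra] += size[rb]

    groups = {}
    for a in actors:
        groups.setdefault(find(a), []).append(a)
    return list(groups.values())


actors = ["a", "b", "c", "d"]
message_counts = {("a", "b"): 50, ("c", "d"): 40, ("b", "c"): 5}
groups = colocate(actors, message_counts, capacity=2)
```

here "a" and "b" end up on one worker and "c" and "d" on another, so only the rarely used b-c link has to cross the network.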

Validation & Calibration

as I mentioned before, validation and calibration are difficult problems. frankly, I do not know how to best approach these. but I do know that it should be an easier process.

Multiscale Modeling, Level-of-Detail (LOD), Attention/Focus

one way to contend with the computational requirements of large simulations is to
simulate different parts/scales of the world at different resolutions,
depending on the user's interest.

perhaps if I'm focusing on Brazil, the other parts of the world are simulated in less detail.
but once I move my focus to another country, then that is simulated in higher detail.

perhaps if I'm not viewing an entire country but just a city, then that city is simulated in greater detail than the rest of the country. we don't stop simulating other areas entirely, because it's possible that events out there will influence events in our area of focus.

if I focus down to an individual, I may want to see their thought process, and see every detail of their decisions. we may want to know exactly what route they take to get to work. but at the level of a city, we may not really care about that detail.
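
a minimal sketch of focus-driven level of detail, assuming the world is divided into regions with a known adjacency graph. each region gets a detail level from its hop distance to the user's focus, and nothing is ever switched off entirely. the region graph and the level names are hypothetical.

```python
from collections import deque


def detail_levels(adjacency, focus, levels=("high", "medium", "low")):
    """Assign a level of detail to every region by hop distance from the focus."""
    distance = {focus: 0}
    queue = deque([focus])
    while queue:  # breadth-first search outward from the focus
        region = queue.popleft()
        for neighbor in adjacency[region]:
            if neighbor not in distance:
                distance[neighbor] = distance[region] + 1
                queue.append(neighbor)
    coarsest = len(levels) - 1
    # far away or unreachable regions fall back to the coarsest level
    return {
        region: levels[min(distance.get(region, coarsest), coarsest)]
        for region in adjacency
    }


# hypothetical region graph
adjacency = {
    "Brazil": ["Argentina", "Peru"],
    "Argentina": ["Brazil", "Chile"],
    "Peru": ["Brazil", "Chile"],
    "Chile": ["Argentina", "Peru"],
    "Japan": [],
}
lod = detail_levels(adjacency, focus="Brazil")
```

moving the focus just means calling `detail_levels` again with a different `focus`.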

Spatial support

many agent-based models excel because they are able to model space in a way that is difficult for other methodologies. the ideal framework would provide out-of-the-box support for continuous spaces and discrete spaces of different forms (grid, hex, etc).
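
to make the difference between discrete space types concrete, here is a small sketch of neighbor lookups on a square grid and on a hex grid. the hex functions assume axial coordinates (q, r), which is one common convention among several; the function names are hypothetical.

```python
def grid_neighbors(x, y, width, height):
    """4-connected neighbors of a cell on a bounded square grid."""
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(nx, ny) for nx, ny in candidates if 0 <= nx < width and 0 <= ny < height]


# in axial coordinates every hex cell has the same six neighbor offsets
HEX_OFFSETS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def hex_neighbors(q, r):
    """The six neighbors of a hex cell on an unbounded hex grid."""
    return [(q + dq, r + dr) for dq, dr in HEX_OFFSETS]


def hex_distance(a, b):
    """Number of hex steps between two cells given in axial coordinates."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2
```

a framework could hide these behind one shared "space" interface so a model can switch grid types without rewriting agent logic.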

Pathfinding and planning support

when dealing with any spatial agent-based model, you will inevitably need to model how agents and other resources move through a space, whether it's by foot, or by car, or by train, etc. so pathfinding is necessary, especially weighted pathfinding, which makes it easier to model the complications of real movement. for instance, alternative forms of transportation can be expressed as different grid weights, and the desirability of certain paths can be taken into account (as opposed to only choosing paths based on distance).

and because pathfinding is a general method applicable to any graph, it can be applied to non-spatial problems, e.g. decision making/planning flows.
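
a minimal sketch of weighted pathfinding using A* on a square grid. it assumes each cell has a positive cost to enter (for example, slow terrain costs more) and that `None` marks an impassable cell. the grid and its weights are hypothetical.

```python
import heapq


def a_star(costs, start, goal):
    """A* on a grid; costs[y][x] is the cost of entering cell (x, y), None is impassable."""
    height, width = len(costs), len(costs[0])
    min_cost = min(c for row in costs for c in row if c is not None)

    def heuristic(cell):
        # Manhattan distance scaled by the cheapest cell, so it never overestimates
        return min_cost * (abs(cell[0] - goal[0]) + abs(cell[1] - goal[1]))

    frontier = [(heuristic(start), 0, start)]
    best = {start: 0}
    came_from = {}
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            path = [cell]
            while cell in came_from:  # walk back to the start
                cell = came_from[cell]
                path.append(cell)
            return path[::-1], cost
        if cost > best[cell]:
            continue  # a cheaper route to this cell was already expanded
        x, y = cell
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nx < width and 0 <= ny < height) or costs[ny][nx] is None:
                continue
            new_cost = cost + costs[ny][nx]
            if new_cost < best.get((nx, ny), float("inf")):
                best[(nx, ny)] = new_cost
                came_from[(nx, ny)] = cell
                heapq.heappush(frontier, (new_cost + heuristic((nx, ny)), new_cost, (nx, ny)))
    return None, float("inf")


# hypothetical terrain: the middle column is slow except in the bottom row
costs = [
    [1, 9, 1],
    [1, 9, 1],
    [1, 1, 1],
]
path, total = a_star(costs, start=(0, 0), goal=(2, 0))
```

the shortest route by distance goes straight across the slow cell, but the cheapest route takes the long way around through the bottom row. the same function works for planning if "cells" are states and "costs" are the costs of actions.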

Utility function designer

Presentation & Interactivity

one of the priorities with this new framework is model accessibility. we want people outside of policy and academia to engage with and develop these simulations.

the way to make models most accessible is to host them on the web

support for building arbitrary web frontends on top of models
consume output in real-time from a running simulation, which can be visualized in any arbitrary way
also capable of sending real-time input to the simulation to change parameters, experiment with interventions
more and more people are learning how to build with web technologies, so this potentially opens the door for people to build their own interfaces or visualizations on top of the work of others
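
a sketch of the simulation side of such an interface, with the network layer left out. it assumes some server (not shown) forwards each JSON frame to the browser and drops parameter changes from the browser into `inbox`. the `run` function, the toy growth model, and the parameter name are all hypothetical.

```python
import json
import queue


def run(steps, params, inbox):
    """Yield one JSON frame per step, applying any parameter changes that arrived in between."""
    population = 100.0
    for step in range(steps):
        # drain pending input from the frontend without blocking the simulation
        while True:
            try:
                params.update(inbox.get_nowait())
            except queue.Empty:
                break
        population *= 1 + params["growth_rate"]
        yield json.dumps({
            "step": step,
            "population": round(population, 2),
            "params": dict(params),
        })


inbox = queue.Queue()
frames = run(steps=3, params={"growth_rate": 0.10}, inbox=inbox)
first = json.loads(next(frames))
inbox.put({"growth_rate": 0.0})  # an intervention arrives from the frontend
second = json.loads(next(frames))
```

because the output is plain JSON, any frontend that can parse it can visualize the run however it likes.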

Modeling Ecosystem

we also want to encourage a culture of and community around simulation, so that it becomes a more common practice. to do so, we need some platform where people can work collaboratively and share models they've created. sort of like how GitHub has influenced programming.

model versioning, to answer questions of model provenance

diffing, to see meaningful differences between versions of models

forking, so people can build on what others have made. given that models generally require making assumptions, others can fork and change the model if their assumptions differ.

interchange format, so models can be exported into some portable format

we don't just want to make model development collaborative, but also provide support for publishing the results of these models in a way that tracks their provenance.

publishing, so models can be published with detailed run reports, version and authorship information, all output and logs/traces, what input data was used, and perhaps a system for discussion
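
a rough sketch of versioning and diffing, assuming a model can be exported as a nested dictionary of parameters (the "interchange format" here is just JSON). the version id is a content hash, and the diff lists which parameters changed. the spec layout and function names are hypothetical.

```python
import hashlib
import json


def model_version(spec):
    """Content hash of a model spec: identical specs get identical version ids."""
    canonical = json.dumps(spec, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def diff_specs(old, new, prefix=""):
    """List (path, old_value, new_value) for every entry that differs.

    A key missing on one side is reported with None for that side.
    """
    changes = []
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}{key}"
        a, b = old.get(key), new.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            changes.extend(diff_specs(a, b, prefix=path + "."))
        elif a != b:
            changes.append((path, a, b))
    return changes


base = {"agents": {"count": 1000, "behaviors": ["planning"]}, "tax_rate": 0.2}
fork = {"agents": {"count": 1000, "behaviors": ["planning", "mortality"]}, "tax_rate": 0.3}
changes = diff_specs(base, fork)
```

a published run report could then cite the version id of the exact model and input data that produced it.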

Atomic Behaviors

many models within certain domains have common behavior requirements
would be great if the process of designing and assembling a simulation were like playing with legos
basically, "atomic behaviors" are like plugins that users can attach to their agents so they implement common behaviors.

Could include:

homophily/sorting
communication/language
reputation
promises/threats (conditional behaviors and assessing likelihood of follow-through)
temporal discounting
requests (i.e. asking someone to do something)
planning
simple learning (e.g. Q-learning)
mortality (agent death under certain conditions)
reproduction (new agents under certain conditions)

though, the interactions of these behaviors may lead to unintended consequences for users.
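
a minimal sketch of what plugin-style behaviors could look like, assuming each behavior is just a function called on the agent every step. the `Agent` class and the three behaviors are hypothetical, not an existing library.

```python
import random


class Agent:
    def __init__(self, age=0):
        self.age = age
        self.alive = True
        self.behaviors = []

    def step(self, world):
        # run each attached behavior in order; dead agents stop acting
        for behavior in self.behaviors:
            if self.alive:
                behavior(self, world)


def aging(agent, world):
    agent.age += 1


def mortality(max_age):
    def behavior(agent, world):
        if agent.age >= max_age:
            agent.alive = False
    return behavior


def reproduction(probability, rng):
    def behavior(agent, world):
        if rng.random() < probability:
            child = Agent()
            child.behaviors = list(agent.behaviors)  # children inherit the same plugins
            world.append(child)
    return behavior


rng = random.Random(1)
founder = Agent()
founder.behaviors = [aging, mortality(max_age=3), reproduction(0.5, rng)]
world = [founder]
for _ in range(5):
    for agent in list(world):
        agent.step(world)
    world[:] = [a for a in world if a.alive]
```

even here the order of the list matters: putting `mortality` before `aging` lets an agent live one step longer, a small case of behaviors interacting in ways the user may not expect.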

Example: Agent Zero, an attempt at creating a "first-principles" agent integrating rational, affective, and social components

Thank you

thank you again Bernardo for inviting me. I'm looking forward to learning more about all of your work here.