Why?
I'm interested in seeing whether computers can be taught concepts in a better way than we currently teach them. There is a whole field of study in machine learning called embodied cognition (or grounded learning) that speculates computers can never truly be intelligent unless they interact with the world. Under this hypothesis, truly interacting with the world would require computers to have bodies, like robots, but the hardware is not yet there to make this happen. So instead of hardware and robotics, I want to see if we can do software bodies in simulation. I've collected thousands of concepts from dictionaries and then turned them into mini simulations, one by one.
Many of them are very crude, but they are getting better over time.
Each concept is like an encyclopedia entry that contains a text definition and the simulation.
I have tried to code each simulation in a generative way so that each time you view it, it is different.
Some of the simulations are 3D third-person, some are 3D first-person, and others are 2D side-view or 2D overhead; it really depends on the simulation. In general, though, I try to give each simulation a subject, or main character.
I've integrated physics into some of the simulations, but some of them don't really require it (yet). They are currently all written in JavaScript so that anyone with a browser can view them. I suspect all of the simulations will be redone over time as more is learned. In coding them up, the most important things for me are to demonstrate the concept clearly and to write elegant code. I've categorized the concept data into several main types: verbs, nouns, and adjectives. There are of course other types of words, like prepositions, but I'm most interested in verbs: the embodied cognition hypothesis is about interacting with the world, which means movement, and verbs are all about doing things.
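To make that concrete, here is a minimal sketch of what one concept entry could look like. The field names and the canvas-based simulate function are hypothetical illustrations, not the actual schema I use.

```javascript
// A minimal sketch of one concept entry (hypothetical field names).
// Each entry pairs a dictionary-style definition with a generative
// simulation: simulate() picks new random parameters on every view.
const flapping = {
  word: "flapping",
  type: "verb",
  definition: "to move wings or arms quickly up and down",
  simulate(canvas) {
    const ctx = canvas.getContext("2d");
    const wingSpan = 20 + Math.random() * 40;     // varies per view
    const flapSpeed = 0.05 + Math.random() * 0.1; // varies per view
    let t = 0;
    (function frame() {
      ctx.clearRect(0, 0, canvas.width, canvas.height);
      const cx = canvas.width / 2;
      const cy = canvas.height / 2;
      const wingY = Math.sin(t) * wingSpan; // up-and-down motion
      // The subject: a body segment with two wings.
      ctx.beginPath();
      ctx.moveTo(cx - wingSpan, cy - wingY);
      ctx.lineTo(cx, cy);
      ctx.lineTo(cx + wingSpan, cy - wingY);
      ctx.stroke();
      t += flapSpeed;
      requestAnimationFrame(frame);
    })();
  },
};
```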
Some verbs are very easy to simulate, like "flapping", "passing", or "meeting", but many verbs are abstract, such as "winning", "adapt", "allow", and "concentrate", and so I have not tackled them yet, nor do I know how to represent them.
This is a very hard problem to solve, and also the key problem: how can we represent these ideas in generic ways? How does our mind represent these seemingly abstract ideas?
Not only are these verbs difficult to simulate, but each of them has side effects that must be accounted for in its definition.
For example, take the verb "winning", whose definition is "to be successful at a contest". If the game were soccer, someone watching might think the simulation shows "playing" or "kicking", so how would you distinguish the concepts of playing, kicking, and winning?
There are different internal states that our simulation must capture for it to be effective. And the concept "to win" must be represented and associated with other concepts like "game", "points", and "competitors".
And speaking of being embodied, to be winning means you are probably exerting a lot of effort to out-compete your opponents. How do you represent that in the simulation?
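One way I could imagine handling this, purely as a sketch with made-up names, is to give the simulation an explicit internal state for the contest, so that "winning" is defined over that state rather than over the visible motion:

```javascript
// Hypothetical internal state for a "winning" simulation. The visible
// motion might be identical to "playing" or "kicking"; what makes it
// winning is this state plus its links to related concepts.
const matchState = {
  game: "soccer",                       // associated concept: game
  points: { subject: 2, opponent: 1 },  // associated concept: points
  competitors: ["subject", "opponent"], // associated concept: competitors
  effort: 0.9,                          // embodied exertion, 0..1
};

// "To be successful at a contest": ahead of every other competitor.
function isWinning(state, who) {
  return state.competitors.every(
    (name) => name === who || state.points[who] > state.points[name]
  );
}

console.log(isWinning(matchState, "subject")); // true
```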
Let's look at the concept/verb "concentrate". The definition is "to think carefully about something", or "to focus one's attention or mental effort on a particular object or activity".
There is an internal state that must be modeled here as well. How do you model that? It's like going from looking at many things at the same time to looking at only one thing.
This is pretty hard to simulate, but what I would try as a first pass (I haven't tried it yet) is to draw two scenes: one from the third person showing a bunch of objects in the room, and another from the subject's point of view showing their field of vision narrowing from many things down to one.
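As a sketch of that untried first pass (everything here is hypothetical), the narrowing could be a clipping circle that shrinks over time in the subject's view:

```javascript
// Sketch of the two-scene idea for "concentrate" (hypothetical names).
// Left canvas: third-person view of the subject among many objects.
// Right canvas: the subject's view, with the field of vision narrowing
// from many things down to the single object of attention.
function drawConcentrate(roomCanvas, viewCanvas, t) { // t goes 0 -> 1
  const room = roomCanvas.getContext("2d");
  const view = viewCanvas.getContext("2d");
  const objects = [[40, 40], [120, 60], [200, 30], [80, 120], [160, 110]];

  // Third-person scene: the subject plus all the objects in the room.
  room.clearRect(0, 0, roomCanvas.width, roomCanvas.height);
  objects.forEach(([x, y]) => room.fillRect(x, y, 10, 10));
  room.strokeRect(100, 150, 12, 20); // the subject

  // Subject's scene: clip to a shrinking "spotlight" of attention
  // centered on the one object being concentrated on.
  view.clearRect(0, 0, viewCanvas.width, viewCanvas.height);
  view.save();
  view.beginPath();
  view.arc(120, 60, 120 * (1 - t) + 15, 0, Math.PI * 2);
  view.clip();
  objects.forEach(([x, y]) => view.fillRect(x, y, 10, 10));
  view.restore();
}
```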
Machine Learning
So I want to build out this large dataset and then feed all the simulations into machine learning algorithms to run different experiments around classification, reinforcement learning, unsupervised learning, and more.
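As a rough, hypothetical sketch of one such pipeline, a classification experiment could capture labeled frames from each concept's canvas. The function and entry shape below are assumptions, mirroring the concept sketch earlier in this post.

```javascript
// Hypothetical sketch: turn the simulations into a labeled image
// dataset for a classification experiment. Assumes entries shaped
// like the concept sketch above, with a simulate(canvas) method.
async function buildDataset(concepts, canvas, framesPerConcept = 10) {
  const dataset = [];
  for (const concept of concepts) {
    concept.simulate(canvas);
    for (let i = 0; i < framesPerConcept; i++) {
      // Let the generative animation advance between captures.
      await new Promise((resolve) => setTimeout(resolve, 100));
      dataset.push({
        label: concept.word,                  // e.g. "flapping"
        image: canvas.toDataURL("image/png"), // one training example
      });
    }
  }
  return dataset;
}
```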
My ultimate goal is to come up with some new neural representation that stores these concepts in ways that may help us build more intelligent computers.
Composability of these concepts is of great interest to me:
In Chinese, for example, fire + car = train, and electric + brain = computer.
This kind of composability works in every language, such as combining adjectives with nouns: big cat, skinny dog, fat human, and so on. But in computers, composing concepts doesn't work very well.
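To illustrate the kind of composability I mean (this is a toy, not a proposed representation), an adjective could act as a function that transforms a noun concept's parameters:

```javascript
// Toy sketch of concept composition: adjectives as functions that
// transform noun concepts, the way language composes "big" + "cat".
const cat = { word: "cat", size: 1.0, legs: 4 };

const big = (noun) => ({ ...noun, word: "big " + noun.word, size: noun.size * 2 });
const skinny = (noun) => ({ ...noun, word: "skinny " + noun.word, girth: 0.5 });

console.log(big(cat).word);         // "big cat", with size doubled
console.log(skinny(big(cat)).word); // "skinny big cat": compositions chain
```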
Help
This is a large project that spans computer graphics, machine learning, deep learning, physics, linguistics, and more. I've been thinking about this problem space for a very long time but didn't know how to start, so instead of just "spinning my wheels" and overthinking everything, I built these mini simulations as a starting point and plan to readjust along the way.
If any of this sounds interesting, please contact me at jtoy@jtoy.net.