Interpretation of “Neural Network as the World”

Stephane H. Maes

September 14, 2020

Abstract:

Recently, a controversial series of papers proposed that the universe may be a neural network. The proposal rests on the observation that, with an irreversible thermodynamics model of the learning process of a neural network, it appears possible to model quantum and classical physics, to observe the emergence of a General Relativity spacetime and gravity, and plausibly to construct a generalized holographic principle beyond the AdS/CFT correspondence conjecture. The approach has been received with some skepticism.

In this paper[1], we present our suggestions for interpreting these results, including how NN models could relate to Wigner’s wonder at why mathematics describes the physical world so well.

____

1. Introduction

[2] shows that if information theory is modeled with (covariant) irreversible / non-equilibrium thermodynamic processes then, close to equilibrium, the thermodynamic variable conjugate to the information content (tensor) is an emergent spacetime that obeys the Einstein-Hilbert dynamics of General Relativity. This result is to be related to [4], which derives the emergence of quantum mechanics from classical irreversible thermodynamics. Away from equilibrium, the picture is less clear. We note that the irreversibility has to be directly related to the quantum behavior.

Following on these results, [3] proposes a thermodynamic model for Machine Learning (ML) and derives a proposal for a thermodynamics of learning. NNs are one example of ML, but we know that any AI or ML algorithm can be modeled as a NN [12-14].

[1] then models NN thermodynamics, using [3] and drawing on [2,4], and shows:

  • Close to equilibrium, and when the entropy contributions from learning are small, one recovers a Schrödinger equation and a wave function that result from the stochastic dynamics of the training variables as they randomly explore where to go to learn. This amounts to small-scale events that try different evolutions to find hints of the best ones, without changing much of what the NN has learned. It denotes a state of the NN in which equilibrium has been reached and new values of the model variables are visited at random, just in case they could help or because learning continues.
  • Further away from equilibrium, and hence at larger scales, the random fluctuations of the learning variables qi are smaller and less visible, and the learning process dominates the thermodynamics. The training variables then evolve according to a classical Hamiltonian and can therefore be modeled by classical Physics. This corresponds to a state of the NN in which it can estimate how to progress in order to learn, i.e. to improve the loss/cost function (think of gradient-based methods such as steepest descent for learning/training/optimization).
  • When the dynamics of the state of the neurons is modeled directly, [2] applies and, under suitable conditions (close to equilibrium and with weak interactions between the neurons, at least when they are nonlocal), the dynamics of the neurons follows Einstein’s GR field equations.
  • By analyzing the In and Out layers of the NN versus its hidden layers, one can hypothesize ways to recover a generalized holographic principle that would link a quantum-mechanically dominated NN (In + Out layers) to a deep / many-layer NN dominated by gravity.
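
The contrast between the first two regimes above can be illustrated with a toy sketch. It is not the model of [1] and it derives neither a Schrödinger equation nor a Hamiltonian: it only runs noisy gradient descent on a single learning variable q and compares the size of the deterministic learning drift with the size of the random fluctuations. The quadratic loss, the parameter values and all function names are assumptions made for this illustration.

```python
import random


def run_noisy_descent(q0=10.0, steps=2000, learning_rate=0.05,
                      noise_scale=0.05, seed=0):
    """Noisy gradient descent on the toy loss 0.5 * q**2 (hypothetical choice).

    Returns the absolute drift and the absolute random kick at every step.
    """
    rng = random.Random(seed)
    q = q0
    drifts, kicks = [], []
    for _ in range(steps):
        drift = -learning_rate * q                # gradient of 0.5 * q**2 is q
        kick = noise_scale * rng.gauss(0.0, 1.0)  # random fluctuation
        drifts.append(abs(drift))
        kicks.append(abs(kick))
        q += drift + kick
    return drifts, kicks


def mean(values):
    return sum(values) / len(values)


def regime_ratio(drifts, kicks, start, stop):
    """Mean |drift| divided by mean |kick| over the steps start..stop-1."""
    return mean(drifts[start:stop]) / mean(kicks[start:stop])


drifts, kicks = run_noisy_descent()
# Far from the minimum: the learning drift dominates the fluctuations.
print("early ratio:", regime_ratio(drifts, kicks, 0, 20))
# Near the minimum: the fluctuations dominate the learning drift.
print("late ratio:", regime_ratio(drifts, kicks, 1800, 2000))
```

With these assumed parameters, the early ratio is expected to be above 1 (learning dominates, the classical-like regime) and the late ratio below 1 (fluctuations around equilibrium dominate).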

Section 2 presents additional considerations on what we can learn from [1].

However, this model does not yet model entanglement, which is a key missing piece before we can claim that a truly complete quantum model emerges from [1]. A proposal to that effect is presented in [5], where it is interpreted as hinting at the multi-fold mechanisms proposed in [11].

2. A NN model of the world? An alternate interpretation

[1] proposes that the universe is a NN. We do not believe that this is the only interpretation of the results presented in [1], and we want to propose an alternative explanation. As already mentioned, the NN approach can be seen as a model of the dynamics of Physics in the universe. Such a model is mathematical; in fact, it is a consequence of Hilbert’s 13th problem and of the ability to model any system with deep hidden layers, and in particular with a NN, as demonstrated by the Kolmogorov-Arnold representation theorem [6] and the Universal approximation theorem [7].

In the present case, the approximated functions are the dynamics of the state variables, i.e. the equations of motion. Per the theorems above, we know that such an approximation is (almost) always possible (up to discontinuities) and to any desired degree of accuracy (for the right optimization strategy in the case of a NN).
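
A minimal sketch can illustrate the flavor of the approximation results above. It hand-constructs (without any training, so it is not the optimization strategy discussed here, nor anything taken from [1]) a one-hidden-layer network of steep sigmoid units that act as soft steps, and shows that the worst-case error on a target function shrinks as hidden units are added. The target function sin(x), the interval, the steepness value and all names are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    # Logistic function written via tanh so that large |z| cannot overflow.
    return 0.5 * (1.0 + math.tanh(0.5 * z))


def build_network(target, lo, hi, hidden_units, steepness=10.0):
    """Hand-built one-hidden-layer network approximating target on [lo, hi].

    Each hidden unit is a soft step centred between two neighbouring knots;
    its output weight is the change of the target across that interval.
    """
    h = (hi - lo) / hidden_units
    knots = [lo + i * h for i in range(hidden_units + 1)]
    bias_out = target(knots[0])
    units = []
    for left, right in zip(knots, knots[1:]):
        centre = 0.5 * (left + right)
        weight_out = target(right) - target(left)
        units.append((steepness / h, centre, weight_out))

    def network(x):
        return bias_out + sum(w * sigmoid(k * (x - c)) for k, c, w in units)

    return network


def max_error(target, network, lo, hi, samples=1000):
    """Largest absolute error on an evenly spaced grid of sample points."""
    xs = [lo + (hi - lo) * i / (samples - 1) for i in range(samples)]
    return max(abs(target(x) - network(x)) for x in xs)


two_pi = 2.0 * math.pi
for width in (10, 50, 200):
    net = build_network(math.sin, 0.0, two_pi, width)
    print(width, "hidden units, max error:", max_error(math.sin, net, 0.0, two_pi))
```

The construction only illustrates that such an approximation exists for a smooth function; a trained NN would instead find its weights through the loss/cost optimization.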

What is interesting is that if the algorithm for loss/cost function optimization relies on (classical) Thermodynamics (for irreversible and non-equilibrium processes, with a Free Energy model), it naturally uncovers the dynamics described in section 1 and in [1]. Because the NN also includes the model of the learning process, it captures in one shot both the dynamics of the physical system (i.e. the universe) and the dynamics of information processing. It thereby makes the physical information theory aspects concrete (e.g. see [8] for related aspects of physical information theory), something that can now be captured in a common thermodynamic (and physical) model. It goes beyond [4] and justifies considerations like the thermodynamics of learning or the principle of conservation of information. In our view, much more than having a NN model (or be, per [1]) the universe, the key aspect is that we have a complete model for modeling and computing physical and information entropy.

In such a model, it makes sense that entropy extremization and action extremization become equivalent or dual. It is also natural that, at small scales, quantum fluctuations around equilibrium imply fluctuations of the learning variables and of the NN state, while at larger scales, away from equilibrium (albeit still close), the system instead behaves classically, as a learning system returning to equilibrium.

So we interpret [1] as a model that shows first and foremost how Physics and Information Theory coexist within a larger model. The model of [1] has its own dynamics. These dynamics may be seen either as a model of how physical systems like the universe handle information conservation, or just as an algorithm that derives the same outcome. More work is needed to determine which. If it is the former, this may actually be a way to answer why and how mathematics is so good at modeling the Universe, as famously asked by Wigner [9] and others, and it would be aligned with Tegmark’s view [10]. Indeed, [1] would then amount to modeling how the universe remains close to thermodynamic equilibrium while always reacting to changes and fluctuations (e.g. random, thermal, external, etc.) in order to catch up with the mathematical prescription, which aims at optimizing the loss/cost function while evolving with minimum disruption, as captured by the extremization of the entropy and action changes. Physical systems take some “guessed optimized efforts” to catch up with and follow the mathematics that describes them correctly, and this mathematics is the reflection of that process. It is a direct application of Pontryagin’s maximum principle [15-17].
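
The link between Pontryagin’s maximum principle and action extremization can be made explicit with a standard textbook sketch. The notation below (state x, control u, costate λ, running cost L, dynamics f) is our assumption for illustration and is not taken from [1]; sign conventions vary between references.

```latex
% Control problem: choose u(t) to minimize the cost J subject to the dynamics.
\dot{x} = f(x,u), \qquad J[u] = \int_0^T L(x,u)\,dt

% Pontryagin Hamiltonian (maximization convention).
H(x,u,\lambda) = \lambda^{\mathsf{T}} f(x,u) - L(x,u)

% Necessary conditions along an optimal trajectory.
u^*(t) = \arg\max_u H(x^*,u,\lambda), \qquad
\dot{x}^* = \frac{\partial H}{\partial \lambda}, \qquad
\dot{\lambda} = -\frac{\partial H}{\partial x}

% Special case: x = q, u = \dot{q}, f(x,u) = u, and L the mechanical Lagrangian.
\frac{\partial H}{\partial u} = 0 \;\Rightarrow\;
\lambda = \frac{\partial L}{\partial \dot{q}} \equiv p, \qquad
H = p\,\dot{q} - L, \qquad
\dot{p} = \frac{\partial L}{\partial q}
```

In this special case (assuming L is convex in the velocity, so that the stationary point is a maximum of H), the costate is the canonical momentum, H is the classical Hamiltonian, and the conditions reduce to the Euler-Lagrange equations of the principle of least action (see [17]).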

3. Conclusions

There have already been many hints of relationships between spacetime, entanglement, thermodynamics and information theory: treating the universe as a universal Quantum Computer, encountering error-correcting codes in spacetime (including in [11]), deriving GR from spacetime properties in equilibrium, the relationships between gravity, entropy and entanglement entropy, the principle of conservation of information in Quantum Physics, and the information paradox with Black holes. Information and Physics are closely related, and this paper, along with many of its references, adds to these observations.

We discussed how [1] can be understood intuitively as correctly modeling the universe, in particular as a universal quantum computer, and how its results may relate to Wigner’s question about why mathematics describes the physical world so well.

We refer to [5] for a discussion of how to add entanglement to the model.

____

Cite as: Stephane H Maes, (2020), “Interpretation of “Neural Network as the World””, viXra:2012.0197v1, https://shmaesphysics.wordpress.com/2020/09/14/interpretation-of-neural-network-as-the-world/, September 14, 2020.

____

References:

[1]: Vitaly Vanchurin, (2020), “The world as a neural network”, arXiv:2008.01540v1

[2]: Vitaly Vanchurin, (2018), “Covariant Information Theory and Emergent Gravity”, arXiv:1707.05004v4

[3]: Vitaly Vanchurin, (2020), “Towards a theory of machine learning”, arXiv:2004.09280v3

[4]: D. Acosta, P. Fernandez de Cordoba, J. M. Isidro, J. L. G. Santander, (2012), “Emergent quantum mechanics as a classical, irreversible thermodynamics”, arXiv:1206.4941v2

[5]: Stephane H Maes, (2020), “Implicit Multi-Fold Mechanisms in a Neural Network Model of the Universe”, viXra:2012.0191v1, https://shmaesphysics.wordpress.com/2020/09/12/implicit-multi-fold-mechanisms-in-a-neural-network-model-of-the-universe/, September 12, 2020.

[6]: Wikipedia, “Kolmogorov–Arnold representation theorem” https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Arnold_representation_theorem, Retrieved on September 14, 2020.

[7]: Wikipedia, “Universal approximation theorem”, https://en.wikipedia.org/wiki/Universal_approximation_theorem, Retrieved on September 14, 2020.

[8]: Seth Lloyd, (2006), “Programming the Universe: A Quantum Computer Scientist Takes on the Cosmos”, Alfred A. Knopf

[9]: Wigner, E. P. (1960). “The unreasonable effectiveness of mathematics in the natural sciences. Richard Courant lecture in mathematical sciences delivered at New York University, May 11, 1959”. Communications on Pure and Applied Mathematics. 13: 1–14.

[10]: Max Tegmark, (2007), “The Mathematical Universe”, arXiv:0704.0646v2

[11]: Stephane H. Maes, (2020), “Quantum Gravity Emergence from Entanglement in a Multi-Fold Universe”, viXra:2006.0088v1, https://vixra.org/pdf/2006.0088v1.pdf (June 9, 2020).

[12]: Wikipedia, “Kolmogorov–Arnold representation theorem” https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Arnold_representation_theorem, Retrieved on September 14, 2020.

[13]: Wikipedia, “Universal approximation theorem”, https://en.wikipedia.org/wiki/Universal_approximation_theorem, Retrieved on September 14, 2020.

(Added when pre-print was published on vixra.org)

[14]: Andre Ye, (2020), “Every Machine Learning Algorithm Can Be Represented as a Neural Network”, https://towardsdatascience.com/every-machine-learning-algorithm-can-be-represented-as-a-neural-network-82dcdfb627e3. Retrieved on December 19, 2020.

[15]: Wikipedia, “Pontryagin’s maximum principle”, https://en.wikipedia.org/wiki/Pontryagin%27s_maximum_principle. Retrieved on September 29, 2020.

[16]: “13 Pontryagin’s Maximum Principle”, http://www.statslab.cam.ac.uk/~rrw1/oc/L13.pdf. Retrieved on September 29, 2020.

[17]: Thayer Watkins, “The Nature of the Principle of Least Action in Mechanics”, https://www.sjsu.edu/faculty/watkins/minprin.htm. Retrieved on September 29, 2020.


[1] This is a republication of an appendix in [5], presented independently of notions of multi-fold universe that are not needed for the present analysis.

____


83 thoughts on “Interpretation of “Neural Network as the World”

  1. With respect to https://www.scientificamerican.com/article/do-we-live-in-a-simulation-chances-are-about-50-50/: [1] and here we obtain a discrete spacetime. And yes, it is directly linked to the supra luminous limit. However, rather than an arbitrary approximation or constraint of a simulation as proposed in the Scientific American article, we motivate it from the noncommutativity of the multi-fold and random walk reconstruction of spacetime. We do not need the simulation argument to justify this result. That does not mean it is or it isn’t a simulation. It seems hard to take any position on that at this stage. But holographic aspects are also addressed in the paper on this web site and again logically derived (as facts) from multi-fold mechanisms (hence again no need to resort to the simulation argument to justify it).


  2. QuBits emulation by NN:
    Gao, X. & Duan, L.-M. (2017) “Efficient representation of quantum
    many-body states with deep neural networks”. Nature Communications
    8, 662
    as well as: Bjarni Jónsson, Bela Bauer, Giuseppe Carleo, (2018), “Neural-network states for the classical simulation of quantum computing”, arXiv:1808.05232v1


  3. Related: DO WE LIVE IN A SIMULATION? ONE LITTLE KNOWN THEORY PROVES ELON MUSK WRONG
    Physicists believe the universe is unlikely to be simulated because they have tried to simulate it themselves for decades – and failed https://www.independent.co.uk/life-style/gadgets-and-tech/simulation-theory-elon-musk-pong-matrix-b1967844.html

    Our multi-fold theory would actually potentially support the opposite of the paper, as we also explained how to address the invoked theorem (see https://shmaesphysics.wordpress.com/2020/12/13/viable-lattice-spacetime-and-absence-of-quantum-gravitational-anomalies-in-a-multi-fold-universe/).


    1. Related: Reality is not a simulation and why it matters. Simulations all the way down – https://iai.tv/articles/reality-is-not-a-simulation-and-why-it-matters-auid-2343 and A PHYSICIST REJECTS THE IDEA THAT WE LIVE IN A SIM UNIVERSE – https://mindmatters.ai/2023/01/a-physicist-rejects-the-idea-that-we-live-in-a-sim-universe/.

      Also consider: PHILOSOPHER: WE CAN’T PROVE THAT WE AREN’T LIVING IN A SIMULATION – https://mindmatters.ai/2022/02/philosopher-we-cant-prove-that-we-arent-living-in-a-simulation/.


  4. Related: The Universe Can Bend the Laws of Physics All By Itself, Scientists Say
    A new theory suggests that the universe perpetuates itself by constantly adapting its own physical laws over time.- https://www.popularmechanics.com/science/a38539247/universe-evolves-laws-of-physics-by-itself/ and paper: The Autodidactic Universe – https://arxiv.org/pdf/2104.03902.pdf

    There are some issues IMHO with the proposal in the context of multi-fold theory. While we could have couplings and constants changing (or even multiverses with different values), the multifold mechanisms for gravity and QFT for QED, QCD, Electroweak imply the range / r^(-2) laws. Learning would probably not work any more if another variation was introduced as suggested. More generally, the arguments proposed by the authors work only abstractly, not in a universe where laws like gravity, QFT etc. are explained by microscopic effects. These microscopic effects explain the laws; they can’t be tuned for discrete or continuous evolution of the laws…


  5. Related: An AI Just Independently Discovered Alternate Physics – https://www.sciencealert.com/ai-has-discovered-alternate-physics-on-its-own and paper: Discovering State Variables Hidden in Experimental Data – https://arxiv.org/pdf/2112.10755.pdf

    The Science Alert title and article are “as if!”. This is the result of a few main considerations:
    – not enough data and as a result overshoot of the predicted # of degrees of freedom (see table page 10 for example).
    – Pictorial/ video 2D evaluation
    – lack of generalization to other processes governed by same equations and physics.

    Nothing more. It shows one can estimate degrees of freedom. It does not discover new physics; it (still too) poorly [not a negative connotation on the research] approximates existing physics.


    1. Again about simulations: IS REALITY A SIMULATION? The nature of our reality is one of the greatest mysteries out there. – https://www.inverse.com/science/is-reality-a-simulation
      and Information Physics: The New Frontier – https://arxiv.org/pdf/1009.5161.pdf

      It is strange that none of these papers refer to the strongest hint (not that we claim at all that we are in a simulation): spacetime is discrete, as shown in https://shmaesphysics.wordpress.com/2020/06/09/paper-published-as-preprint-quantum-gravity-emergence-from-entanglement-in-a-multi-fold-universe/.


  6. Doubtful… Patterns yes, theories. Simulations, yes. More: doubtful as discussed in comments above: Will artificial intelligence ever discover new laws of physics? Algorithms can pore over astrophysical data to identify underlying equations. Now, physicists are trying to figure out how to imbue these “machine theorists” with the ability to find deeper laws of nature – https://shmaesphysics.wordpress.com/2020/09/14/interpretation-of-neural-network-as-the-world/


    1. About that paper: Will artificial intelligence ever discover new laws of physics? Algorithms can pore over astrophysical data to identify underlying equations. Now, physicists are trying to figure out how to imbue these “machine theorists” with the ability to find deeper laws of nature – https://shmaesphysics.wordpress.com/2020/09/14/interpretation-of-neural-network-as-the-world/:

      Nope.

      Laws or models or simulations are only as good as the model or simulation, and the data.

      Remember https://www.sciencealert.com/ai-has-discovered-alternate-physics-on-its-own and https://arxiv.org/pdf/2112.10755.pdf? In that case, it is the result of a few main considerations:
      – not enough data and as a result overshoot of the predicted # of degrees of freedom (see table page 10 for example).
      – Pictorial/ video 2D evaluation
      – lack of generalization to other processes governed by same equations and physics.
      Nothing more. It shows one can estimate degrees of freedom. It does not discover new physics; it (still too) poorly [not a negative connotation on the research] approximates existing physics.

      I also like: On the fallacy of replacing physical laws with machine-learned inference systems – https://science-memo.blogspot.com/2021/04/on-fallacy-of-replacing-physical-laws.html

      Sure it may get (or approximate) a law correctly, but it will then be by luck.


  7. From time crystals to wormholes: When is a quantum simulation real?
    Physicists are using quantum computers to conjure various exotic phenomena and are claiming that their creations are truly real. The work is forcing us to ask challenging questions about the nature of quantum reality


  8. About the universe as a simulation and the second law of information dynamics:
    – Could a new law of physics support the idea we’re living in a computer simulation? – https://phys.org/news/2023-10-law-physics-idea-simulation.html and paper: The second law of infodynamics and its implications for the simulated universe hypothesis – https://pubs.aip.org/aip/adv/article/13/10/105308/2915332/The-second-law-of-infodynamics-and-its
    – New physics law could predict genetic mutations – https://phys.org/news/2022-07-physics-law-genetic-mutations.html, and paper: Second law of information dynamics – https://pubs.aip.org/aip/adv/article/12/7/075310/2819368/Second-law-of-information-dynamics

