Monday, March 20, 2023

GPT physics papers, the next generation

This blog began with the idea of treating snarxiv's fake physics papers seriously.

Last year I found a powerful language model on the web (GPT-J), and tried generating whole papers using titles and abstracts generated by snarxiv. This was my first attempt; then this was my repeated exploration of a particular topic.

Five months later, the era of ChatGPT began. (At the same time, the free web version of GPT-J stopped working.) Like millions of other users, I have been conducting many experiments with ChatGPT and Bing. 

But I only just thought of returning to the snarxiv challenge, using this new generation of tools. Here's my first attempt: "Quintessence at the Intermediate Scale Extremizes the Strong CP Problem"

The resulting "paper" has a palpably different flavor to those generated by GPT-J-6B. But first, let me describe how the paper was generated. The paper is too big for a single output from ChatGPT, so first I gave it the title and abstract and asked it to generate a table of contents; then I manually asked it to generate each item listed in the table of contents, one after the other.
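The two-step procedure just described can be sketched in a few lines of Python. This is only a sketch under stated assumptions: `ask` is a hypothetical stand-in for one request to a chat model (prompt text in, reply text out), and the prompt wording is invented for illustration. The original experiment was done by hand in a single conversation, so the model could also see its earlier replies, which this sketch does not reproduce.

```python
from typing import Callable


def generate_paper(title: str, abstract: str, ask: Callable[[str], str]) -> str:
    """Build a paper section by section from a title and an abstract.

    `ask` is a hypothetical callable standing in for one request to a
    chat model: it receives a prompt string and returns the reply text.
    """
    header = f"Title: {title}\nAbstract: {abstract}\n"

    # Step 1: ask for a table of contents, one section heading per line.
    toc_reply = ask(header + "\nWrite a table of contents for this paper, one section per line.")
    headings = [line.strip() for line in toc_reply.splitlines() if line.strip()]

    # Step 2: ask for each listed section in turn and collect the replies.
    parts = [header]
    for heading in headings:
        body = ask(header + f"\nWrite the section titled: {heading}")
        parts.append(f"{heading}\n{body}\n")
    return "\n".join(parts)
```

Passing the model call in as an argument keeps the sketch independent of any particular service, and lets it be tried out with a dummy function before any real model is involved.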

You will note that it doesn't actually contain any equations or references. The style of the whole paper, in fact, resembles an abstract - merely declaring that certain things will be explained or shown, but not actually delivering on anything promised. 

On the other hand, the texts produced by GPT-J regularly contained both equations and references, but were far less coherent than what ChatGPT has written. 

The difference between GPT-J and ChatGPT is that ChatGPT, after being "pre-trained" on a vast corpus of writings, has then been conditioned so as to consistently present itself in the persona of a helpful assistant. GPT-J, on the other hand, was (I assume) a raw language model with pre-training only: presented with an input, it would immediately attempt to continue in the style and structure implied. As a result, GPT-J would directly output a (fictitious, incoherent) arxiv paper, complete with LaTeX markup. 

ChatGPT is far more logical and coherent in its output, thanks to intensive fine-tuning. As a result, its paper has a genuinely logical structure, but it also doesn't spontaneously produce equations and references, the way that GPT-J did. However, I'm sure it has the capacity to do so, if prompted appropriately. 

In June 2022, I wrote: 

"I suspect that in less than ten years, you'll be able to input a snarxiv abstract into an AI, and almost instantly get back an essay which really does its best to deliver coherently on the promised content."

It's now nine months later, and I think that a little experimentation with the ChatGPT API would rapidly yield papers combining the logical coherence of ChatGPT with the detailed creativity of GPT-J. How close one could come to the quality of a good arxiv paper is a deep question. 

Sunday, August 28, 2022

Antipodes of the standard model

The Feynman diagrams employed in perturbation theory represent particular contributions to the quantum sum over histories. Mathematically, they are integrals full of zeta functions and "polylogarithms" and many other interesting numbers and functions. There is even an algebra of ways to combine the diagrams, since particles exiting one scattering process can enter another; this allows two or more diagrams to be combined into one... In recent decades, this hidden world of mathematical relationships has been intensively studied, under the name of amplitudeology. 

One tool used to simplify these very complicated integrals is the "symbol" of the integral. This is something like a list of the elementary variables and functions appearing in the integral. From this list alone, one can reconstruct a significant part of the integral. 

Late last year, it was discovered that the symbol of one scattering process is the reverse of the symbol of another scattering process. That is, the variables and functions appearing in the path integral of the first process, appear in reverse in the path integral of the second process. This was deeply unexpected. The operation of reversing a symbol is formally a part of the "Hopf algebra" of the Feynman diagrams - there is an "antipode" operator that does this - but no one had envisaged that it might be physically meaningful.
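As a toy illustration of what "reversing a symbol" means, one can treat a single term of a symbol as an ordered word of letters. The sketch below assumes the standard convention for the antipode on such words (reverse the order, and multiply by -1 for each letter); the letters used are hypothetical placeholders, not taken from any actual amplitude. The actual duality is understood to involve also a map between the kinematic variables of the two processes, which is not modeled here.

```python
from typing import Tuple

Word = Tuple[str, ...]


def antipode(word: Word) -> Tuple[int, Word]:
    """Return (sign, reversed word) for one term of a symbol.

    Assumed convention: the antipode reverses the order of the letters
    and contributes a factor of -1 per letter.
    """
    sign = -1 if len(word) % 2 else 1
    return sign, tuple(reversed(word))


# Hypothetical three-letter term: the order flips and the sign is -1.
sign, reversed_word = antipode(("a", "b", "c"))
```

Applying the operation twice restores the original word with overall sign +1, which is why it makes sense to speak of two processes being antipodal duals of each other.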

The scattering processes involved come from supersymmetric Yang-Mills theory. However, they have counterparts in the standard model, and the standard model counterparts of the antipodally dual amplitudes are also perplexing. One process is just gluons in, gluons out, but the other one has a Higgs involved along with the gluons. Gluons come from QCD, but the Higgs is associated with the electroweak sector and doesn't carry color charge - what is it doing in there? 

At this point I have nothing to say about the pure math of the antipodal duality, but I shall record a few thoughts about the appearance of the Higgs. 

First, let me be clear about how this works in super-Yang-Mills theory. The fields in the "N=4" (fourfold extended supersymmetry) super-Yang-Mills theory studied by the amplitudeologists can be called gluon, gluino, and sgluon. The gluon is a vector field, the gluino is a fermion field, and the sgluon is a scalar field. When this is mapped to the standard model, I assume that gluinos correspond to quarks, and that it's the sgluon which corresponds to the Higgs.

Second, I'll mention how a Higgs boson is produced by "gluon fusion", in actual interactions that occur in the hadron collider. Basically, gluons fuse to create one side of a top quark loop, and a Higgs is emitted from the opposite vertex... One may approximate this interaction via a direct "gluon-gluon-Higgs" vertex, and I believe this corresponds to a gluon-gluon-sgluon vertex in super-Yang-Mills.

OK, so, why would an amplitude with a Higgs in it, have a relationship to a pure QCD process? 

In this blog, I have occasionally touched on ways that strong interaction may be related to the electroweak interaction (in ways different from the usual grand unification of both gauge symmetries into a larger simple group). The idea that electroweak interactions could come from gauging the chiral symmetry of the strong interactions, and that this might naturally be so from a higher-dimensional perspective, is one of which I'm very fond. 

Another possibility is that the Higgs is actually toponium, top quark and top antiquark bound by something. Alejandro Rivero's observation that Z0 decay behaves a little like pion decay may be a point in favor of this, given that the Z0 gets its mass from a component of the Higgs field. 

Matti Pitkänen has suggested that the apparent color/electroweak duality implied by antipodal duality, might have something to do with the famous electric-magnetic duality of super-Yang-Mills. In this regard, I would draw attention to another idea of Alejandro's that has been mentioned many times in this blog, the "sBootstrap" which aims to derive all the fermions of the standard model, as fermionic superpartners of mesons and diquarks made of the five light flavors of quark.

Out of many attempts to implement this combinatorial idea within a robust theoretical framework, one of my favorites has been Seiberg duality, in which high-energy N=1 super-QCD, resolves at low energies to a different N=1 gauge theory, in which an extra meson superfield has emerged. The idea here is something like this, that at high energies one has N=1 super-QCD with one heavy quark (the top) and five massless quarks, and that at low energies one has six massive quarks, and an emergent electroweak sector, with the leptons arising as mesino components of the meson superfield... But Seiberg duality is itself a form of electric-magnetic duality. 

All of these might serve as starting points, in a quest to confirm and understand, the possible presence of antipodal duality within the standard model. 

Saturday, July 2, 2022

"Three Fermion Generations from Octonions": another experiment

On the forums, there is a discussion about papers which try to get the three fermion generations "from octonions", somehow. I decided to seed GPT-J just with the title, "Three Fermion Generation from Octonions". In the end I ran the experiment eight times (in honor of the eight-ness of octonions), seven times at temperature 0.95, and once at temperature 0 (which makes the model deterministic). Here is my distillation of the most coherent ideas in the eight incomplete papers that were generated... 
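For readers unfamiliar with the temperature setting, here is a minimal sketch of how it is commonly applied when a language model picks its next token. This is a generic illustration, not GPT-J's actual sampling code: the function name `sample_token` and the toy scores are invented, and treating temperature 0 as "always pick the top-scoring token" is a widespread convention that is assumed here.

```python
import math
import random
from typing import Dict


def sample_token(logits: Dict[str, float], temperature: float, rng: random.Random) -> str:
    """Pick one token from raw scores (logits) at a given temperature."""
    if temperature == 0:
        # Assumed convention: temperature 0 means greedy, deterministic choice.
        return max(logits, key=logits.get)

    # Divide the scores by the temperature, then apply a softmax.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())  # subtracted for numerical stability
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}

    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


# Toy scores for three candidate next words (hypothetical values).
toy_logits = {"octonion": 2.0, "quaternion": 1.0, "spinor": 0.5}
greedy = sample_token(toy_logits, 0, random.Random(0))
varied = [sample_token(toy_logits, 0.95, random.Random(seed)) for seed in range(7)]
```

At temperature 0 the same prompt always gives the same continuation, which is why one run sufficed there, while at 0.95 each run can wander off in a different direction.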

1: each generation is associated with a different irreducible representation of SU(2), or maybe SU(3)

2: particles are classified by a spinor representation of octonions; generations have a quantum number 0, 1/2, 3/2 

3: links to 'Quantum gravity and charge renormalization' by David Toms

4: electroweak unification 'by considering octonions as the base ring'; quarks and leptons described by complex octonions 

5: in an octonionic free fermion system with three chiral generations, there are fermions with spin 1/2 and fermions with spin 0, and the masses come from 'the imaginary units of the octonions in the appropriate group representations', which are constructed using Young tableaux 

6: 'introduces an important ingredient, crucial for the stability of the compactification, namely a non-polynomial superpotential with a new “hair” for the complex structure of the Calabi-Yau manifold'

7: consider fermions in 4d space deformed by 'the co-product algebra of the octonions'

8 (temperature 0): 'the octonionic algebra is the algebra of the three generations of fermions'

Friday, June 3, 2022

"(P,q) Instantons" by O. H. Silverstein

snarxiv was launched in March 2010: a site which generates random imitations of arxiv abstracts.

This blog was launched in June 2011 with an experiment: What happens if you take snarxiv abstracts seriously? What might the paper accompanying a given abstract, actually be about? 

In June 2020, OpenAI began to make GPT-3 available to the world: a "language model" trained on Internet text, which could write a whole essay, or other verbal production, given a short "prompt" to set it off. 

Now in June 2022, I have used GPT-J, another model inspired by GPT-3, to write a whole fictitious physics paper, using a snarxiv abstract as the prompt. Here is the result. 

The individual sentences make sense, but the resulting paper is not coherent on scales greater than a paragraph. But somehow I suspect that in less than ten years, you'll be able to input a snarxiv abstract into an AI, and almost instantly get back an essay which really does its best to deliver coherently on the promised content. 

Wednesday, April 20, 2022

A pole star becomes invisible

That Lubos Motl's famous blog, "The Reference Frame", is now closed to the public, deserves comment, since it is the best physics blog there is; the only place you see a physicist of his level, offering frank, intuitive, technical commentary on topics from the perennial (e.g. what is quantum spin) to the misguided (whether that's fads among his peers, or among fans of alternative physics) to the truly new and promising. 

Apparently he had monetized the blog with Google's AdSense; just before he closed it, AdSense kept telling him that various posts on climate, Covid, etc, were no longer acceptable. So it seems he's taken the whole thing private while he decides what to do, and/or has a break from the stresses of maintaining it. 

Maybe he'll be back, maybe not, but the disappearance of such a valuable resource needs to be recorded. 

Friday, July 16, 2021

Serendipitous sum rules

Today I shall report that I am rather more positive than I was, about the second "wrong" idea in the previous post. The reason is that the sum of all the masses in a multiplet, is a quite natural item to appear in a sum rule! So the relation would be, that the sum of the masses of the charged leptons, considered as a multiplet of a flavor or family symmetry, equals the sum of the neutron and proton masses, with the neutron and proton to be considered as an isospin doublet. 
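As a quick numerical check of how well the proposed relation holds, the two sums can be compared directly. The masses below are rounded Particle Data Group values in MeV, supplied here as assumed inputs rather than taken from the post.

```python
# Approximate masses in MeV (rounded Particle Data Group values; assumed inputs).
CHARGED_LEPTONS = {"electron": 0.510999, "muon": 105.658, "tauon": 1776.86}
NUCLEONS = {"proton": 938.272, "neutron": 939.565}

lepton_sum = sum(CHARGED_LEPTONS.values())
nucleon_sum = sum(NUCLEONS.values())

# Fractional amount by which the lepton sum exceeds the nucleon sum.
relative_gap = (lepton_sum - nucleon_sum) / nucleon_sum

print(f"charged leptons: {lepton_sum:.2f} MeV")
print(f"proton + neutron: {nucleon_sum:.2f} MeV")
print(f"relative gap: {relative_gap:.2%}")
```

With these inputs the lepton sum comes out near 1883 MeV and the nucleon sum near 1878 MeV, a gap of roughly 0.3 percent.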

What I still lack is a mechanism. I believe that (for example, in the linear sigma model), nucleon mass can be regarded as originating in the spontaneous breaking of chiral symmetry. Meanwhile, the Higgs-Yukawa interactions in the standard model give the fermions their masses, once electroweak symmetry breaking occurs; so one might consider models in which chiral symmetry breaking triggers electroweak symmetry breaking (something which might also explain the order-of-magnitude similarity between the QCD scale and the Fermi scale).

On the other hand, since the variations among the fermion masses derive from the Yukawas, it might seem that the relevant symmetry-to-be-broken is the family symmetry, not the electroweak symmetry... Then there are other hints, like obtaining electroweak symmetry by gauging part of chiral symmetry (something which is formally common in chiral perturbation theory, I think), and the Rivero idea that the leptons are Goldstone fermions, superpartners of mesons.

In other news, I will mention that snarxiv tweeted out a fictitious paper whose content is serendipitously close to other things I have been thinking about. The very first posts on this blog were discussions of snarxiv papers, and if there was nothing else to blog about, I might have talked more about "Special Lagrangian Branes Wrapped on the Moduli Space of Squashed Lens Spaces". But I'll save that for another time.

Sunday, June 27, 2021

Ideas, right and wrong

There are many things that I could or should post about here. I must mention the terrible news of Marni Sheppeard's death, which is a loss in so many ways. I blog about her when I can. 

There is also a backlog of ideas, waiting to be analysed and sorted. Today I just wanted to mention two lines of thought. 

One is the potential harmony between Rivero's sBootstrap, Dienes's misaligned supersymmetry, and the Veltman-like sum rule of Lopez-Castro - Pestieau - Garces Doz. 

The other is a cluster of thoughts about how to explain the 313 MeV scale in Brannen's version of the Koide formula. The main thought is: maybe it's a kind of Goldberger-Treiman relation. I think that thought is promising. 
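For reference, the 313 MeV scale can be recovered from the measured charged lepton masses. The sketch assumes the usual form of Brannen's parametrization, sqrt(m_k) = mu * (1 + sqrt(2) * cos(delta + 2*pi*k/3)) for k = 0, 1, 2, which is not spelled out in this post. Because the three cosines sum to zero, the square roots of the masses sum to 3*mu whatever the phase delta is, so mu squared follows directly. The masses are rounded Particle Data Group values in MeV, supplied as assumed inputs.

```python
import math

# Approximate charged lepton masses in MeV (rounded PDG values; assumed inputs).
MASSES = {"electron": 0.510999, "muon": 105.658, "tauon": 1776.86}

sum_of_roots = sum(math.sqrt(m) for m in MASSES.values())
sum_of_masses = sum(MASSES.values())

# Assumed Brannen form: the three cosines cancel, so sum(sqrt(m_k)) = 3 * mu.
mu_squared = (sum_of_roots / 3) ** 2

# Koide's ratio, which the same parametrization fixes at exactly 2/3.
koide_ratio = sum_of_masses / sum_of_roots ** 2

print(f"mu^2 = {mu_squared:.2f} MeV")
print(f"Koide ratio = {koide_ratio:.6f}")
```

With these inputs mu squared comes out a little under 314 MeV, and the Koide ratio agrees with 2/3 to within about one part in 100,000.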

Then there are some other thoughts about it which are surely wrong, but which I shall mention here. One is: what if the charged leptons are different forms of a mesino with a rest mass of 626 MeV, undergoing relativistic periodic motion in compact extra dimensions? You may ask: that's alright for the tauon, but what about the electron and muon, whose masses are less than that? Well, the "answer" is that they have imaginary momentum in the extra dimensions, and that overall there are three complex extra dimensions, like a Calabi-Yau... There's no way this is true, but it was too cute to not mention.

My other wrong thought is this: If you take the trace of Brannen's mass matrix, you find that the sum of electron, muon, and tauon masses, equals two times a nucleon mass. But what if it's really a proton mass plus a neutron mass - the two members of the nucleon isospin doublet? Again, I think it's a cute idea, but it seems very unlikely that this is where the factor of 2 comes from.