[ExI] Why do the language model and the vision model align?

Jason Resch jasonresch at gmail.com
Sun Mar 1 20:55:43 UTC 2026


On Sun, Mar 1, 2026 at 2:20 PM John Clark <johnkclark at gmail.com> wrote:

> On Sun, Mar 1, 2026 at 1:26 AM Jason Resch via extropy-chat <
> extropy-chat at lists.extropy.org> wrote:
>
> *>> Give me a fundamental definition of the word "time" or even "change"
>>> using just pure mathematics and without using any ideas from physics, I'd
>>> really like to hear that!   *
>>>
>>
>> *> To get something like an "evolving 3 dimensional structure"
>> mathematically, you merely add another dimension, and use that dimension to
>> track how different states of that 3-dimensional structure such that
>> different states of it are different at different positions in that 4th
>> dimension,*
>>
>
> *They are both dimensions so why is time different from space? When Euclid
> or Pythagoras wanted to calculate the distance in flat space they didn't
> need a minus sign, but when Einstein needed to calculate the distance in
> flat Minkowski spacetime for special relativity he did need to include
> a minus sign. How come?*
>

Those are excellent questions, and the answer comes from the fact that in
our physical universe, all things travel through spacetime at the same
speed (the speed of light). You can drop the minus sign and treat time as
any other coordinate, if you update your model with the assumption that the
proper velocity through spacetime of all objects is always c.

Imagine there were an extra dimension through which everything moved at
exactly the same speed and in exactly the same direction. Such a dimension
would seem invisible, since we would be unable to move forward or backward
through it relative to anyone else. It would constitute a "phantom
dimension".

Time in our universe is *almost* like such a phantom dimension. Though we
cannot change our proper speed through spacetime, we can alter our
direction through spacetime. When we do so, we can "fall behind" others who
do not alter their trajectory through spacetime. For a visual reference,
imagine a highway in which every car travels at exactly 60 mph (no faster
and no slower). Should any car on this highway redirect its velocity away
from straight down the road (say to change lanes) then it would fall
slightly behind relative to the other cars. And this is exactly what
happens with time-dilation. In fact, there is a perfect geometric analogue
which gives the same exact calculations you find in relativity. See this
diagram for reference:

https://cdn.alwaysasking.com/wp-content/uploads/2020/07/twin-paradox-spacetime-768x771.webp
Two twins, Sam and Pam, go through the twin paradox. Sam stays on Earth for
10 years while his sister Pam travels to the star Proxima Centauri and back
at 80% the speed of light. Sam (in blue) remains on Earth and uses all of
his speed to “travel through time”. Pam (in pink) travels at 80% the speed
of light to reach Proxima Centauri 4 light years away. The trip there takes
5 years from Sam’s point of view, but only 3 from Pam’s point of view. The
proper length of both Sam’s and Pam’s paths through spacetime is 10 light
years, but because Pam used 80% of her speed to travel through space, she
could only use 60% of her speed to “travel through time”. So while Sam aged
ten years, Pam aged only six.
You can draw any path you like for Pam as she goes through spacetime, and
so long as you keep the length of the path limited to 10 ly, you will be
able to exactly determine her age at the end of the journey. It is a
perfect model of the results relativity predicts, but a vastly simpler
geometric model (just uses light-years vs. years as the coordinates).
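
As a rough sketch of that bookkeeping (the function name and the way the
path is split into legs are illustrative assumptions, not taken from the
book or the article), each leg of a path has a fixed length through
spacetime, part of which is spent on distance through space, and whatever
is left over is the time the traveler ages:

```python
import math

def proper_time(legs):
    """Years aged along a path made of straight legs.

    Each leg is (path_length_ly, space_distance_ly), in units where c = 1,
    so a path length of 1 light-year corresponds to 1 year for someone at
    rest. Hypothetical helper, for illustration only.
    """
    total = 0.0
    for path_length, distance in legs:
        if abs(distance) > path_length:
            raise ValueError("leg would require faster-than-light travel")
        # Pythagoras: path_length^2 = aged^2 + distance^2
        total += math.sqrt(path_length**2 - distance**2)
    return total

sam = [(10.0, 0.0)]              # stays on Earth, no distance through space
pam = [(5.0, 4.0), (5.0, 4.0)]   # 4 ly out and 4 ly back at 0.8c

print(proper_time(sam))  # 10.0
print(proper_time(pam))  # 6.0
```

Any other itinerary for Pam can be tried by changing the legs, as long as
the path lengths still add up to 10 ly.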

You only need to introduce a negative sign to the coordinate system if you
presume that when at rest one has a proper velocity of 0 through spacetime.
But so much of relativity becomes so much more intuitive and makes so much
more sense, when you consider tau to be another coordinate, and all 4
dimensions of spacetime as equally spatial. Length contraction, clock
synchronization, time dilation, relativity of simultaneity, all fall out as
immediate intuitive consequences of this.
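
For reference, the two conventions can be written side by side for one
spatial dimension. The symbols t (coordinate time), tau (proper time) and x
(position) are assumed notation here, not something fixed earlier in the
thread:

```latex
% Minkowski form: proper time as the interval, with a minus sign
(c\,d\tau)^2 = (c\,dt)^2 - dx^2

% The same relation rearranged: a Euclidean sum with tau as a coordinate
(c\,dt)^2 = (c\,d\tau)^2 + dx^2

% Dividing by dt^2: the combined speed through (x, c\tau) is always c
c^2 = \left(c\,\frac{d\tau}{dt}\right)^2 + \left(\frac{dx}{dt}\right)^2
```

With dx/dt = 0.8c the last line gives d(tau)/dt = 0.6, which is the 80%
and 60% split in the twin example above.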

See the book Relativity Visualized for more on this:
https://www.amazon.com/Relativity-Visualized-Lewis-Carroll-Epstein/dp/093521805X
Or my article on time, which presents relativity using the same methods as
in this book: https://alwaysasking.com/what-is-time/


>
> *>> The fundamental difference between a book and a Turing Machine is that
>>> one can change but the other cannot, so one can perform a calculation but
>>> the other cannot. And that's also why Nvidia is the most valuable company
>>> in the world and Penguin Random House is not.*
>>>
>>
>> *> More attempts at introducing red herrings.*
>>
>
> *If that's the best rebuttal you can come up with then I guess I won that
> round.  *
>

If that helps you sleep at night. But note that when I say "a book that
describes physics is not the physical universe" and give the analogy "a
book that describes a Turing machine is not a Turing machine", you keep
returning to "Books can't compute anything." That's a given; I agree a
book can't compute anything. But it's a non sequitur, a distracting point
unconnected to my example, which is why I call it a red herring.


>
>
>
>>
>>>>>> *"It gradually hit me that this illusion of randomness business
>>>>>> really wasn’t specific to quantum mechanics at all. Suppose that some
>>>>>> future technology allows you to be cloned while you’re sleeping, and that
>>>>>> your two copies are placed in rooms numbered 0 and 1. When they wake up,
>>>>>> they’ll both feel that the room number they read is completely
>>>>>> unpredictable and random."-- Max Tegmark in “Our Mathematical Universe”
>>>>>> (2014)*
>>>>>>
>>>>>
>>>>> *>> And I agree with Tegmark's above statement 100%. What I very
>>>>> strongly disagree with is the statement "it's impossible to predict what
>>>>> number "YOU" will see" is a profundity. It's a silly thing to say because
>>>>> in this context the word "you" is undefined. *
>>>>>
>>>>
>>>> *> If you agree with Tegmark, then you agree with Marchal*
>>>>
>>>
>>> *NO!! The way Marchal threw around personal pronouns made it very clear
>>> that the man LITERALLY didn't know what he was talking about, I don't agree
>>> with everything Tegmark said in his book but, unlike Marchal, he
>>> did LITERALLY understand the words he was using. *
>>>
>>
>>
>> *> Here is Tegmark
>> <https://archive.org/details/ourmathematicalu0000tegm/page/194/mode/2up?q=%22It+gradually+hit+me+that+this+illusion+of+randomness%22>.
>> I have highlighted the pronouns for your convenience, since you seem to
>> have missed them:*
>>
>> *Page 194 — It gradually hit me that this illusion of randomness business
>> really wasn’t specific to quantum mechanics at all. Suppose that some
>> future technology allows you to be cloned while you’re sleeping, and that
>> your two copies are placed in rooms numbered 0 and 1 (Figure 8.3). When
>> they wake up, they’ll both feel that the room number they read is
>> completely unpredictable and random. If in the future, it becomes possible
>> for you to upload your mind to a computer, then what I’m saying here will
>> feel totally obvious and intuitive to you, since cloning yourself will be
>> as easy as making a copy of your software. If you repeated the cloning
>> experiment from Figure 8.3 many times and wrote down your room number each
>> time, you’d in almost all cases find that the sequence of zeros and ones
>> you’d written looked random, with zeros occurring about 50% of the time.*
>>
>> Which "you" is Tegmark referring to when he's talking about dozens of
>> clones being duplicated?
>>
>
>
> *Tegmark makes it very clear that when he refers to "you" he is referring
> to anybody or anything that remembers being John Clark before the
> duplicating process occurred.  By contrast Marchal never made it clear what
> he meant by "you", or much of anything else for that matter.  *
>

It was clear to me what Tegmark, Muller, and Bruno meant. I am sorry you
were not able to understand Bruno in the 10+ years you spent debating him.
But I am happy that you find Tegmark's language clear enough that you can
now understand Bruno's point.



>
>>> *> Since you still seem confused, I put this together today, and I think
>> it will help you understand what I mean by "derive"*
>>
>> *https://drive.google.com/file/d/1wHZPpB1QOrQU5HmHVOP-FUIq5NL1WPU3/view?usp=sharing*
>> <https://drive.google.com/file/d/1wHZPpB1QOrQU5HmHVOP-FUIq5NL1WPU3/view?usp=sharing>
>>
>
> *If 38 pages are needed to explain what you mean by a word as simple as
> "derive" then communicating with you is going to be very difficult. *
>
>
>> *>>> You may also find this useful: *
>>>
>>> >> *Bekenstein-Hawking entropy*
>>> <http://www.scholarpedia.org/article/Bekenstein-Hawking_entropy>
>>>
>>
>> *>It's a broken link,*
>>
>
> *Sorry. Try this:  *
>
>
> *http://www.scholarpedia.org/article/Bekenstein-Hawking_entropy
> <http://www.scholarpedia.org/article/Bekenstein-Hawking_entropy> *
>

Thanks. I was trying to find this reference the other day.


>
>
>
>> *>> The Bekenstein Bound is a physics law that sets a limit on the
>>> maximum amount of information (entropy) that can be contained within a
>>> given area (not the volume) of space. The formula is S ≤ 2πKRE/hc  where R
>>> is the radius, E is the total energy (including mass), and π,K,h and c are
>>> all constants. But it's important to understand the difference between the
>>> Entropy Bound (a container's capacity) and the Actual Entropy (how much
>>> stuff is actually inside the container). *
>>>
>>
>> *>Yes. But note the bound is defined by E*R. In other words mass-energy *
>> radius. The larger the radius, even for the same mass-energy, the higher
>> the bound is.*
>>
>
> *The larger an area (not the volume) that encloses a sphere the larger
> the maximum amount of information that can be encoded on its surface, but
> that just tells you the Bekenstein Bound, the maximum amount that could be
> stored, it doesn't tell you how much information is actually stored. To
> know that you not only need to know the area of a sphere you also have to
> know the mass of it.*
>
>
>
>> *>> A large, spread-out cloud of gas has a very high Entropy Bound
>>> because its large area is capable of holding a lot of information, a.k.a.
>>> entropy, but its Actual Entropy could be quite low if mass of the gas is
>>> small and smoothly distributed. A Black Hole of the same mass has a much
>>> lower Entropy Bound than the large cloud because its radius R is small and
>>> thus so is its area, BUT small though it is the Black Hole has maxed out
>>> that bound. So if you want a given amount of mass to encode as much
>>> information as is physically possible then you'll need to concentrate that
>>> mass until it turns into a Black Hole.*
>>>
>>
>> *> You are missing a key qualifier (added in blue):*
>> *"if you want a given amount of mass to encode as much information into a
>> given volume as is physically possible then you'll need to concentrate that
>> mass until it turns into a Black Hole."*
>>
>
> *If a given area of a sphere (NOT its VOLUME) encodes as much information
> as is physically possible on the sphere's surface then it's as massive as a
> black hole because it is a black hole. *
>

You keep returning to this other red herring of area vs. volume. I've said
repeatedly that I agree with that. Why do you keep mentioning it?

>
>
> *> Note that two atoms can encode more information than exists in a
>> stellar black hole, so long as you have unlimited volume in which to place
>> them.*
>>
>
> *Two atoms in an unlimited volume cannot form a black hole, they'd need to
> be placed ridiculously close to each other. And a  stellar black hole has
> far more than two atoms worth of mass-energy .*
>

Yes, but if you read the Bekenstein bound equation you will see that
increasing R enables you to increase the amount of information that can be
represented.

Let's say the stellar-mass black hole has 10^77 bits of information.

We can encode information using 2 atoms as follows: to encode the bit
string S using the two atoms, represent S as a distinct number N, and place
the 2 atoms N meters apart.

So long as the space available for placing these atoms N meters apart is
unlimited, then, in principle, any amount of information can be stored
using just these two atoms, whether it is a whole hard-drive's worth, or a
whole stellar-mass-black-hole's worth. It is counterintuitive, but this is
a direct implication of the Bekenstein bound formula.
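
To put rough numbers on how the bound scales with R, here is a minimal
sketch. It assumes the standard form of the bound with the reduced Planck
constant, S <= 2*pi*k*R*E/(hbar*c), converted to bits by dividing by
k*ln(2), and it assumes E is just the rest mass-energy of two hydrogen
atoms. The function names and the choice of hydrogen are illustrative:

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant, J*s
C = 299792458.0           # speed of light, m/s
HYDROGEN_MASS = 1.6735e-27  # approximate mass of one hydrogen atom, kg

def bekenstein_bits(radius_m, energy_j):
    """Upper bound on information (in bits) for a region of given radius and energy."""
    return 2 * math.pi * radius_m * energy_j / (HBAR * C * math.log(2))

def radius_for_bits(bits, energy_j):
    """Radius (in meters) at which the bound first allows the given number of bits."""
    return bits * HBAR * C * math.log(2) / (2 * math.pi * energy_j)

# Assumption: only the rest mass-energy of the two atoms counts toward E.
TWO_ATOM_ENERGY = 2 * HYDROGEN_MASS * C**2

print(f"bound at R = 1 m: {bekenstein_bits(1.0, TWO_ATOM_ENERGY):.3e} bits")
print(f"R for 1e77 bits:  {radius_for_bits(1e77, TWO_ATOM_ENERGY):.3e} m")
```

The bound is linear in R for a fixed E, so doubling the radius doubles the
number of bits the bound permits. This only evaluates the bound itself; it
says nothing about how much space a particular encoding scheme would need.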

I hope this example shows why "R" matters so much in the bound.



>
> *> the current entropy of our universe remains far below its maximum
>> possible entropy.*
>>
>
> *Good thing too, maximum possible entropy will only occur at the heat
> death of the universe. *
>


But I wonder if such a heat death is possible if the universe is always
expanding (and thus always making room for more entropy).

See:
https://www.informationphilosopher.com/solutions/scientists/layzer/growth_of_order/chaisson.jpg
for example. If growth of S_max always outpaces growth of S, then there
will be no final heat death.

Jason

>